CODING SEQUENCE POLYMORPHISMS IN 
VASCULAR PATHOLOGY GENES 

RELATED APPLICATIONS 

This application is a Continuation-in-Part of U.S. Application No. 09/054,272, 
filed April 1, 1998, the contents of which are incorporated herein in their entirety by 
reference. 

BACKGROUND OF THE INVENTION 

The genomes of all organisms undergo spontaneous mutation in the course of 
their continuing evolution, generating variant forms of progenitor sequences (Gusella, 
Ann. Rev. Biochem. 55, 831-854 (1986)). The variant form may confer an 
evolutionary advantage or disadvantage relative to a progenitor form or may be 
neutral. In some instances, a variant form confers a lethal disadvantage and is not 
transmitted to subsequent generations of the organism. In other instances, a variant 
form confers an evolutionary advantage to the species and is eventually incorporated 
into the DNA of many or most members of the species and effectively becomes the 
progenitor form. In many instances, both progenitor and variant form(s) survive and 
co-exist in a species population. The coexistence of multiple forms of a sequence 
gives rise to polymorphisms. 

Several different types of polymorphism have been reported. A restriction 
fragment length polymorphism (RFLP) is a variation in DNA sequence that alters the 
length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32, 314-331 
(1980)). The restriction fragment length polymorphism may create or delete a 
restriction site, thus changing the length of the restriction fragment. RFLPs have been 
widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369; 
Donis-Keller, Cell 51, 319-337 (1987); Lander et al., Genetics 121, 85-99 (1989)). 
When a heritable trait can be linked to a particular RFLP, the presence of the RFLP in 
an individual can be used to predict the likelihood that the individual will also exhibit 
the trait. 

Other polymorphisms take the form of short tandem repeats (STRs) that 
include tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats 
are also referred to as variable number tandem repeat (VNTR) polymorphisms. 
VNTRs have been used in identity and paternity analysis (US 5,075,217; Armour et 
al., FEBS Lett. 307, 113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP 
370,719), and in a large number of genetic mapping studies. 

Other polymorphisms take the form of single nucleotide variations between 
individuals of the same species. Such polymorphisms are far more frequent than 
RFLPs, STRs and VNTRs. Some single nucleotide polymorphisms (SNPs) occur in 
protein-coding sequences (coding sequence SNPs (cSNPs)), in which case one of the 
polymorphic forms may give rise to the expression of a defective or otherwise variant 
protein and, potentially, a genetic disease. Examples of genes in which 
polymorphisms within coding sequences give rise to genetic disease include β-globin 
(sickle cell anemia), apoE4 (Alzheimer's Disease), Factor V Leiden (thrombosis), and 
CFTR (cystic fibrosis). cSNPs can alter the codon sequence of the gene and therefore 
specify an alternative amino acid. Such changes are called "missense" when another 
amino acid is substituted, and "nonsense" when the alternative codon specifies a stop 
signal in protein translation. When the cSNP does not alter the amino acid specified, 
the cSNP is called "silent". 
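
The missense, nonsense and silent categories described above can be illustrated with a short sketch. It assumes the standard genetic code and a codon written on the coding strand; the function name `classify_csnp` and the 0-based `position` argument are illustrative choices and do not appear in the original text.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_csnp(reference_codon, position, variant_base):
    """Classify a single-base change within one codon.

    position is the 0-based index (0, 1 or 2) of the changed base.
    Returns "silent", "nonsense" or "missense". A change that converts
    a stop codon into an amino acid codon is reported as "missense"
    here, which is a simplification made for this sketch.
    """
    ref = reference_codon.upper()
    var = ref[:position] + variant_base.upper() + ref[position + 1:]
    ref_aa, var_aa = CODON_TABLE[ref], CODON_TABLE[var]
    if var_aa == ref_aa:
        return "silent"
    if var_aa == "*":
        return "nonsense"
    return "missense"
```

For instance, changing the middle base of the codon GAG to T gives GTG (glutamate to valine), which this sketch reports as missense.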

Other single nucleotide polymorphisms occur in noncoding regions. Some of 
these polymorphisms may also result in defective protein expression (e.g., as a result 
of defective splicing). Other single nucleotide polymorphisms have no phenotypic 
effects. 

Single nucleotide polymorphisms can be used in the same manner as RFLPs 
and VNTRs, but offer several advantages. Single nucleotide polymorphisms occur 
with greater frequency and are spaced more uniformly throughout the genome than 
other forms of polymorphism. The greater frequency and uniformity of single 
nucleotide polymorphisms means that there is a greater probability that such a 
polymorphism will be found in close proximity to a genetic locus of interest than 
would be the case for other polymorphisms. The different forms of characterized 
single nucleotide polymorphisms are often easier to distinguish than other types of 
polymorphism (e.g., by use of assays employing allele-specific hybridization probes 
or primers). 

Only a small percentage of the total repository of polymorphisms in humans 
and other organisms has been identified. The limited number of polymorphisms 
identified to date is due to the large amount of work required for their detection by 
conventional methods. For example, a conventional approach to identifying 
polymorphisms might be to sequence the same stretch of DNA in a population of 
individuals by dideoxy sequencing. In this type of approach, the amount of work 
increases in proportion to both the length of sequence and the number of individuals 
in a population and becomes impractical for large stretches of DNA or large numbers 
of persons. 

SUMMARY OF THE INVENTION 

Work described herein pertains to the identification of polymorphisms which 
can predispose individuals to disease, particularly vascular pathologies, by 
resequencing large numbers of genes in a large number of individuals. Eighteen 
genes in a minimum of 30 individuals have been resequenced as described herein, and 
92 SNPs have been discovered (see the Table). Forty of these SNPs are cSNPs which 
specify a different amino acid sequence, while 49 of the SNPs are silent cSNPs. 
Three of the SNPs were located in non-coding regions. 

The invention relates to a gene which comprises a single nucleotide 
polymorphism at a specific location. In a particular embodiment the invention relates 
to the variant allele of a gene having a single nucleotide polymorphism, which variant 
allele differs from a reference allele by one nucleotide at the site(s) identified in the 
Table. Complements of these nucleic acid segments are also included. The segments 
can be DNA or RNA, and can be double- or single-stranded. Segments can be, for 
example, 5-10, 5-15, 10-20, 5-25, 10-30, 10-50 or 10-100 bases long. 

The invention further provides allele-specific oligonucleotides that hybridize 
to a gene comprising a single nucleotide polymorphism or to the complement of the 
gene. These oligonucleotides can be probes or primers. 

The invention further provides a method of analyzing a nucleic acid from an 
individual. The method determines which base is present at any one of the 
polymorphic sites shown in the Table. Optionally, a set of bases occupying a set of 
the polymorphic sites shown in the Table is determined. This type of analysis can be 
performed on a number of individuals, who are tested for the presence of a disease 
phenotype. The presence or absence of disease phenotype is then correlated with a 
base or set of bases present at the polymorphic site or sites in the individuals tested. 
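
The correlation step described above can be sketched as a 2x2 tabulation of base against disease phenotype. The text does not specify a statistical test; the Pearson chi-square statistic used below is an assumption chosen for illustration, and the function names and input layout are hypothetical.

```python
def association_table(observations, base):
    """Tabulate individuals by base at one polymorphic site and phenotype.

    observations is an iterable of (base_at_site, has_disease) pairs,
    one pair per individual (a simplification that ignores diploidy).
    Returns (a, b, c, d): base+disease, base+no disease,
    other base+disease, other base+no disease.
    """
    a = b = c = d = 0
    for observed_base, has_disease in observations:
        carries = observed_base == base
        if carries and has_disease:
            a += 1
        elif carries:
            b += 1
        elif has_disease:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denominator
```

A larger statistic indicates a stronger departure from independence between the base and the phenotype in the individuals tested; interpreting it requires a significance threshold, which the source does not state.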

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A-1C are a table illustrating the locations of single nucleotide 
polymorphisms of various genes. 

Figure 2 is a listing of the genes from Figures 1A-C with their corresponding 
GenBank Accession numbers and the nucleotide position within that sequence at 
which the single nucleotide polymorphism is located. 

Figures 3A-B are a listing of the nucleotide sequence corresponding to 
GenBank Accession number D10202 for the gene PTAFR. 

Figures 4A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number D29832 for the gene AT3. 

Figures 5A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number D38081 for the gene TBXA2R. 

Figures 6A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number J02703 for the gene ITGB3. 

Figures 7A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number J02764 for the gene ITGA2B. 

Figures 8A-F are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number J02846 for the gene F3. 

Figures 9A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number J02898 for the gene CETP. 

Figures 10A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number J03225 for the gene TFPI. 

Figures 11A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number K02059 for the gene PROC. 

Figure 12 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number L00336 for the gene LDLR. 

Figure 13 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number L00338. 

Figure 14 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number L00343 for the gene LDLR. 

Figure 15 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number L00344 for the gene LDLR. 

Figure 16 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number L00345 for the gene LDLR. 

Figure 17 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number L00347 for the gene LDLR. 

Figure 18 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number L00349 for the gene LDLR. 

Figures 19A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number L00351 for the gene LDLR. 

Figures 20A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number L29401 for the gene LDLR. 

Figures 21A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number L32765 for the gene F5. 

Figures 22A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M11058 for the gene HMGCR. 

Figures 23A-F are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M11228 for the gene PROC. 

Figures 24A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M12625 for the gene LCAT. 

Figures 25A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M12849 for the gene HCF2. 

Figures 26A-E are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M14335 for the gene F5. 

Figures 27A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M15856 for the gene LPL. 

Figures 28A-N are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M17262 for the gene F2. 

Figures 29A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M20311 for the gene ITGB3. 

Figure 30 is a listing of the nucleotide sequence corresponding to the GenBank 
Accession number M21645 for the gene AT3. 

Figures 31A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M22569 for the gene ITGA2B. 

Figures 32A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M30185 for the gene CETP. 

Figures 33A-H are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M33320 for the gene ITGA2B. 

Figures 34A-G are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M58600 for the gene HCF2. 

Figures 35A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M62424 for the gene F2R. 

Figures 36A-C are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number M76722 for the gene LPL. 

Figures 37A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number U59436 for the gene LDLR. 

Figures 38A-B are a listing of the nucleotide sequence corresponding to the 
GenBank Accession number Z22555 for the gene CLanalog. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to a gene which comprises a single nucleotide 
polymorphism (SNP) at a specific location. The gene which includes the SNP has at 
least two alleles, referred to herein as the reference allele and the variant allele. The 
reference allele (prototypical or wild type allele) has been designated arbitrarily and 
typically corresponds to the nucleotide sequence of the gene which has been deposited 
with GenBank under a given Accession number. The variant allele differs from the 
reference allele by one nucleotide at the site(s) identified in the Table. The present 
invention also relates to variant alleles of the described genes and to complements of 
the variant alleles. The invention further relates to portions of the variant alleles and 
portions of complements of the variant alleles which comprise (encompass) the site of 
the SNP and are at least 5 nucleotides in length. Portions can be, for example, 5-10, 
5-15, 10-20, 5-25, 10-30, 10-50 or 10-100 bases long. For example, a portion of a 
variant allele which is 5 nucleotides in length includes the single nucleotide 
polymorphism (the nucleotide which differs from the reference allele at that site) and 
four additional nucleotides which flank the site in the variant allele. These 
nucleotides can be on one or both sides of the polymorphism. Polymorphisms which 
are the subject of this invention are defined in the Table with respect to the reference 
sequence deposited in GenBank under the Accession number indicated. For example, 
the invention relates to a portion of a gene (e.g., AT3) having a nucleotide sequence 
as deposited in GenBank (e.g., M21645) comprising a single nucleotide 
polymorphism at a specific position (e.g., nucleotide 100). The reference allele for 
AT3 is shown in column 15 and the variant allele is shown in column 17 of the Table. 
The nucleotide sequences of the invention can be double- or single-stranded. 
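
The statement that the flanking nucleotides of a portion can lie on one or both sides of the polymorphism can be made concrete with a brief sketch that enumerates every portion of a given length encompassing the polymorphic site. The function name and the 0-based `site` index are assumptions introduced for illustration only.

```python
def portions_containing_site(allele_sequence, site, length=5):
    """Return every contiguous portion of the given length that includes
    the base at 0-based index `site` of the allele sequence."""
    if not 0 <= site < len(allele_sequence):
        raise ValueError("site lies outside the sequence")
    if length < 1 or length > len(allele_sequence):
        raise ValueError("length must be between 1 and the sequence length")
    # The earliest window ends at the site; the latest window starts at it.
    first_start = max(0, site - length + 1)
    last_start = min(site, len(allele_sequence) - length)
    return [
        allele_sequence[start:start + length]
        for start in range(first_start, last_start + 1)
    ]
```

For a site well inside the sequence, a 5-nucleotide portion can be placed in five ways, ranging from four flanking nucleotides on the 5' side to four on the 3' side.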

The invention further provides allele-specific oligonucleotides that hybridize 
to a gene comprising a single nucleotide polymorphism or to the complement of the 
gene. These oligonucleotides can be probes or primers. 

The invention further provides a method of analyzing a nucleic acid from an 
individual. The method determines which base is present at any one of the 
polymorphic sites shown in the Table. Optionally, a set of bases occupying a set of 
the polymorphic sites shown in the Table is determined. This type of analysis can be 
performed on a number of individuals, who are tested for the presence of a disease 
phenotype. The presence or absence of disease phenotype is then correlated with a 
base or set of bases present at the polymorphic site or sites in the individuals tested. 

DEFINITIONS 

An oligonucleotide can be DNA or RNA, and single- or double-stranded. 
Oligonucleotides can be naturally occurring or synthetic, but are typically prepared by 
synthetic means. Preferred oligonucleotides of the invention include segments of 
DNA, or their complements, which include any one of the polymorphic sites shown in 
the Table. The segments can be between 5 and 250 bases, and, in specific 
embodiments, are between 5-10, 5-20, 10-20, 10-50, 20-50 or 10-100 bases. The 
polymorphic site can occur within any position of the segment. The segments can be 
from any of the allelic forms of DNA shown in the Table. 

As used herein, the terms "nucleotide" and "nucleic acid" are intended to be 
equivalent. The terms "nucleotide sequence", "nucleic acid sequence", "nucleic acid 
molecule" and "segment" are intended to be equivalent. 

Hybridization probes are oligonucleotides which bind in a base-specific 
manner to a complementary strand of nucleic acid. Such probes include peptide 
nucleic acids, as described in Nielsen et al., Science 254, 1497-1500 (1991). Probes 
can be any length suitable for specific hybridization to the target nucleic acid 
sequence. The most appropriate length of the probe may vary depending upon the 
hybridization method in which it is being used; for example, particular lengths may be 
more appropriate for use in microfabricated arrays, while other lengths may be more 
suitable for use in classical hybridization methods. Suitable probes and primers can 
range from about 5 nucleotides to about 30 nucleotides in length. For example, 
probes and primers can be 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 25, 26, 28 or 30 
nucleotides in length. The probe or primer preferably contains at least one 
polymorphic site occupied by any of the possible variant nucleotides. The nucleotide 
sequence can correspond to the coding sequence of the allele or to the complement of 
the coding sequence of the allele. 

As used herein, the term "primer" refers to a single-stranded oligonucleotide 
which acts as a point of initiation of template-directed DNA synthesis under 
appropriate conditions (e.g., in the presence of four different nucleoside triphosphates 
and an agent for polymerization, such as DNA or RNA polymerase or reverse 
transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate 
length of a primer depends on the intended use of the primer, but typically ranges 
from 15 to 30 nucleotides. Short primer molecules generally require cooler 
temperatures to form sufficiently stable hybrid complexes with the template. A 
primer need not reflect the exact sequence of the template, but must be sufficiently 
complementary to hybridize with a template. The term primer site refers to the area 
of the target DNA to which a primer hybridizes. The term primer pair refers to a set 
of primers including a 5' (upstream) primer that hybridizes with the 5' end of the DNA 
sequence to be amplified and a 3' (downstream) primer that hybridizes with the 
complement of the 3' end of the sequence to be amplified. 

As used herein, linkage describes the tendency of genes, alleles, loci or genetic 
markers to be inherited together as a result of their location on the same chromosome. 
It can be measured by percent recombination between the two genes, alleles, loci or 
genetic markers. 

As used herein, polymorphism refers to the occurrence of two or more 
genetically determined alternative sequences or alleles in a population. A 
polymorphic marker or site is the locus at which divergence occurs. Preferred 
markers have at least two alleles, each occurring at a frequency of greater than 1%, and 
more preferably greater than 10% or 20% of a selected population. A polymorphic 
locus may be as small as one base pair. Polymorphic markers include restriction 
fragment length polymorphisms, variable number of tandem repeats (VNTRs), 
hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, 
tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. 
The first identified allelic form is arbitrarily designated as the reference form and 
other allelic forms are designated as alternative or variant alleles. The allelic form 
occurring most frequently in a selected population is sometimes referred to as the 
wildtype form. Diploid organisms may be homozygous or heterozygous for allelic 
forms. A diallelic or biallelic polymorphism has two forms. A triallelic 
polymorphism has three forms. 
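
The frequency criterion for preferred markers can be checked with a short sketch such as the following. It assumes diploid genotypes supplied as pairs of alleles; the function names and the default threshold argument are illustrative and are not taken from the original text.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return the frequency of each allele among diploid genotypes,
    where each genotype is a pair of alleles such as ("A", "G")."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def is_preferred_marker(genotypes, min_frequency=0.01):
    """True when at least two alleles each occur at a frequency greater
    than min_frequency in the sampled population."""
    frequencies = allele_frequencies(genotypes)
    common = [a for a, f in frequencies.items() if f > min_frequency]
    return len(common) >= 2
```

With eight homozygous A/A individuals and two heterozygous A/G individuals, the allele frequencies are 0.9 for A and 0.1 for G, so the site meets the 1% criterion but not a strict greater-than-10% criterion.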

Work described herein pertains to the resequencing of large numbers of genes 
in a large number of individuals to identify polymorphisms which can predispose 
individuals to disease, particularly vascular pathologies. Eighteen genes in a 
minimum of 30 individuals have been resequenced as described herein, and 92 SNPs 
have been discovered (see the Table). Forty of these SNPs are cSNPs which specify a 
different amino acid sequence, while 49 of the SNPs are silent cSNPs. Three of the 
SNPs were located in non-coding regions. 

The 18 genes which were subjected to analysis encode proteins that are 
involved in biochemical pathways that regulate blood coagulation, lipid metabolism, 
and platelet and endothelial cell function. Polymorphisms in all 18 genes are 
candidates for genetic factors that influence the pathophysiology of the blood and 
blood vessels and thus can be relevant to the genetic risk of cardiovascular diseases. 
The identified polymorphisms can also be relevant to other disease categories. 

By altering amino acid sequence, SNPs may alter the function of the encoded 
proteins. The discovery of the SNP facilitates biochemical analysis of the variants 
and the development of assays to characterize the variants and to screen for 
pharmaceuticals that would interact directly with one or another form of the protein. 
SNPs (including silent SNPs) may also alter the regulation of the gene at the 
transcriptional or post-transcriptional level. SNPs (including silent SNPs) also enable 
the development of specific DNA, RNA, or protein-based diagnostics that detect the 
presence or absence of the polymorphism in particular conditions. 

A single nucleotide polymorphism occurs at a polymorphic site occupied by a 
single nucleotide, which is the site of variation between allelic sequences. The site is 
usually preceded by and followed by highly conserved sequences of the allele (e.g., 
sequences that vary in less than 1/100 or 1/1000 members of the populations). 

A single nucleotide polymorphism usually arises due to substitution of one 
nucleotide for another at the polymorphic site. A transition is the replacement of one 
purine by another purine or one pyrimidine by another pyrimidine. A transversion is 
the replacement of a purine by a pyrimidine or vice versa. Single nucleotide 
polymorphisms can also arise from a deletion of a nucleotide or an insertion of a 
nucleotide relative to a reference allele. Typically the polymorphic site is occupied by 
a base other than the reference base. For example, where the reference allele contains 
the base "T" at the polymorphic site, the altered allele can contain a "C", "G" or "A" at 
the polymorphic site. 
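
The distinction between transitions and transversions defined above follows directly from the purine and pyrimidine classes, as in this minimal sketch. The function name is an illustrative choice, and only the four DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(reference_base, variant_base):
    """Return "transition" or "transversion" for a single-base substitution."""
    ref, var = reference_base.upper(), variant_base.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or var not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == var:
        raise ValueError("reference and variant bases are identical")
    pair = {ref, var}
    # Both purines or both pyrimidines: the chemical class is preserved.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"
```

Under this classification, a reference "T" replaced by "C" is a transition, while replacement by "G" or "A" is a transversion.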

Hybridizations are usually performed under stringent conditions, for example, 
at a salt concentration of no more than 1 M and a temperature of at least 25°C. For 
example, conditions of 5X SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, 
pH 7.4) and a temperature of 25-30°C, or equivalent conditions, are suitable for 
allele-specific probe hybridizations. Equivalent conditions can be determined by 
varying one or more of the parameters given as an example, as known in the art, while 
maintaining a similar degree of identity or similarity between the target nucleotide 
sequence and the primer or probe used. 
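
The arithmetic behind the 5X SSPE example can be sketched as follows. The component concentrations are those stated above; treating the NaCl concentration as "the salt concentration" when testing the 1 M limit is an assumption made for this sketch, and the function names are hypothetical.

```python
# Component concentrations of 5X SSPE in mM, as given in the text.
SSPE_5X_MM = {"NaCl": 750.0, "NaPhosphate": 50.0, "EDTA": 5.0}


def sspe_at_fold(fold):
    """Scale the 5X SSPE recipe to another fold concentration (e.g. 1 for 1X)."""
    return {name: conc * fold / 5 for name, conc in SSPE_5X_MM.items()}


def meets_example_stringency(nacl_mm, temperature_c):
    """Check the example limits: salt no more than 1 M, temperature at least 25 C.

    Assumption: the NaCl concentration stands in for the salt concentration.
    """
    return nacl_mm <= 1000 and temperature_c >= 25
```

Scaling to 1X gives 150 mM NaCl, 10 mM NaPhosphate and 1 mM EDTA, and the 5X buffer at 25°C falls within the stated example limits.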

The term "isolated" is used herein to indicate that the material in question 
exists in a physical milieu distinct from that in which it occurs in nature. For 
example, an isolated nucleic acid of the invention may be substantially isolated with 
respect to the complex cellular milieu in which it naturally occurs. In some instances, 
the isolated material will form part of a composition (for example, a crude extract 
containing other substances), buffer system or reagent mix. In other circumstances, the 
material may be purified to essential homogeneity, for example as determined by 
PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid 
comprises at least about 50, 80 or 90 percent (on a molar basis) of all macromolecular 
species present. 

I. Novel Polymorphisms of the Invention 

The novel polymorphisms of the invention are shown in the Table. 

II. Analysis of Polymorphisms 

A. Preparation of Samples 

Polymorphisms are detected in a target nucleic acid from an individual being 
analyzed. For assay of genomic DNA, virtually any biological sample (other than 
pure red blood cells) is suitable. For example, convenient tissue samples include 
whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair. 
For assay of cDNA or mRNA, the tissue sample must be obtained from an organ in 
which the target nucleic acid is expressed. For example, if the target nucleic acid is a 
cytochrome P450, the liver is a suitable source. 

Many of the methods described below require amplification of DNA from 
target samples. This can be accomplished by, e.g., PCR. See generally PCR 
Technology: Principles and Applications for DNA Amplification (ed. H.A. Erlich, 
Freeman Press, NY, NY, 1992); PCR Protocols: A Guide to Methods and 
Applications (eds. Innis, et al., Academic Press, San Diego, CA, 1990); Mattila et al., 
Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 
17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Patent 
4,683,202. 

Other suitable amplification methods include the ligase chain reaction (LCR) 
(see Wu and Wallace, Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 
(1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 
(1989)), and self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. 
USA, 87, 1874 (1990)) and nucleic acid based sequence amplification (NASBA). The 
latter two amplification methods involve isothermal reactions based on isothermal 
transcription, which produce both single stranded RNA (ssRNA) and double stranded 
DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, 
respectively. 

B. Detection of Polymorphisms in Target DNA 

There are two distinct types of analysis of target DNA for detecting 
polymorphisms. The first type of analysis, sometimes referred to as de novo 
characterization, is carried out to identify polymorphic sites not previously 
characterized (i.e., to identify new polymorphisms). This analysis compares target 
sequences in different individuals to identify points of variation, i.e., polymorphic 
sites. By analyzing groups of individuals representing the greatest ethnic diversity 
among humans and greatest breed and species variety in plants and animals, patterns 
characteristic of the most common alleles/haplotypes of the locus can be identified, 
and the frequencies of such alleles/haplotypes in the population can be determined. 
Additional allelic frequencies can be determined for subpopulations characterized by 
criteria such as geography, race, or gender. The de novo identification of 
polymorphisms of the invention is described in the Examples section. The second 
type of analysis determines which form(s) of a characterized (known) polymorphism 
are present in individuals under test. There are a variety of suitable procedures, which 
are discussed in turn. 

1. Allele-Specific Probes 

The design and use of allele-specific probes for analyzing polymorphisms is 
described by, e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726; 
Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a 
segment of target DNA from one individual but do not hybridize to the corresponding 
segment from another individual due to the presence of different polymorphic forms 
in the respective segments from the two individuals. Hybridization conditions should 
be sufficiently stringent that there is a significant difference in hybridization intensity 
between alleles, and preferably an essentially binary response, whereby a probe 
hybridizes to only one of the alleles. Some probes are designed to hybridize to a 
segment of target DNA such that the polymorphic site aligns with a central position 
(e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 8 or 9 position) of the 
probe. This design of probe achieves good discrimination in hybridization between 
different allelic forms. 

Allele-specific probes are often used in pairs, one member of a pair showing a 
perfect match to a reference form of a target sequence and the other member showing 
a perfect match to a variant form. Several pairs of probes can then be immobilized on 
the same support for simultaneous analysis of multiple polymorphisms within the 
same target sequence. 
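
The design of a reference/variant probe pair with the polymorphic site at a central position can be sketched as below. This is a minimal illustration, not the procedure of the cited references: the function names are hypothetical, the site is placed at 0-based index `length // 2` of the target segment (an assumed reading of "central position"), and each probe is returned as the reverse complement of the target segment it matches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def probe_pair(reference, site, variant_base, length=15):
    """Design probes perfectly matching the reference and variant alleles.

    reference    -- target strand sequence, 5' to 3'
    site         -- 0-based index of the polymorphic base in reference
    variant_base -- base present at the site in the variant allele
    Returns (reference_probe, variant_probe), each complementary to the
    corresponding target segment.
    """
    offset = length // 2  # assumed central position within the segment
    start = site - offset
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence for this probe length")
    ref_segment = reference[start:end].upper()
    var_segment = (
        ref_segment[:offset] + variant_base.upper() + ref_segment[offset + 1:]
    )
    return reverse_complement(ref_segment), reverse_complement(var_segment)
```

The two probes differ at exactly one base, so under sufficiently stringent conditions each would be expected to hybridize preferentially to its own allele.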

2. Tiling Arrays 

The polymorphisms can also be identified by hybridization to nucleic acid 
arrays, some examples of which are described in WO 95/11995. One form of such 
arrays is described in the Examples section in connection with de novo identification 
of polymorphisms. The same array or a different array can be used for analysis of 
characterized polymorphisms. WO 95/11995 also describes subarrays that are 
optimized for detection of a variant form of a precharacterized polymorphism. Such a 
subarray contains probes designed to be complementary to a second reference 
sequence, which is an allelic variant of the first reference sequence. The second group 
of probes is designed by the same principles as described in the Examples, except that 
the probes exhibit complementarity to the second reference sequence. The inclusion 
of a second group (or further groups) can be particularly useful for analyzing short 
subsequences of the primary reference sequence in which multiple mutations are 
expected to occur within a short distance commensurate with the length of the probes 
(e.g., two or more mutations within 9 to 21 bases). 

3. Allele-Specific Primers 

An allele-specific primer hybridizes to a site on target DNA overlapping a 
polymorphism and only primes amplification of an allelic form to which the primer 
exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 
(1989). This primer is used in conjunction with a second primer which hybridizes at a 
distal site. Amplification proceeds from the two primers, resulting in a detectable 
product which indicates the particular allelic form is present. A control is usually 
performed with a second pair of primers, one of which shows a single base mismatch 
at the polymorphic site and the other of which exhibits perfect complementarity to a 
distal site. The single-base mismatch prevents amplification and no detectable 
product is formed. The method works best when the mismatch is included in the 
3'-most position of the oligonucleotide aligned with the polymorphism because this 
position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456). 
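
Placing the polymorphic base at the 3'-most position of the primer can be sketched as follows. The sketch assumes a primer written in the same sense as the listed strand (so that it anneals to the complementary strand), and it models priming only by whether the 3' terminal base matches the allele present; the function names are hypothetical and real primer design involves further considerations not covered here.

```python
def allele_specific_primer(target, site, allele_base, length=20):
    """Build a primer whose 3'-most base aligns with the polymorphic site.

    target      -- listed strand sequence, 5' to 3'
    site        -- 0-based index of the polymorphic base in target
    allele_base -- base the primer should match at the site
    """
    start = site - length + 1
    if start < 0 or site >= len(target):
        raise ValueError("not enough upstream sequence for this primer length")
    # Upstream flank copied from the target, allele-specific base at the 3' end.
    return target[start:site].upper() + allele_base.upper()


def three_prime_base_matches(primer, template, site):
    """Simplified priming model: True when the primer's 3' terminal base
    equals the base present at the polymorphic site of the template."""
    return primer[-1] == template[site].upper()
```

In this simplified model, the primer built for one allele reports a match only on templates carrying that allele, mirroring the control with a mismatched primer described above.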

4. Direct-Sequencing 

The direct analysis of the sequence of polymorphisms of the present invention 
can be accomplished using either the dideoxy chain termination method or the 
Maxam-Gilbert method (see Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd 
Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual 
(Acad. Press, 1988)). 

5. Denaturing Gradient Gel Electrophoresis 

Amplification products generated using the polymerase chain reaction can be 
analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can 
be identified based on the different sequence-dependent melting properties and 
electrophoretic migration of DNA in solution. Erlich, ed., PCR Technology, 
Principles and Applications for DNA Amplification, (W.H. Freeman and Co, New 
York, 1992), Chapter 7. 

6. Single-Strand Conformation Polymorphism Analysis 

Alleles of target sequences can be differentiated using single-strand 
conformation polymorphism analysis, which identifies base differences by alteration 
in electrophoretic migration of single stranded PCR products, as described in Orita et 
5 aU Proc. Nat. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be 
generated as described above, and heated or otherwise denatured, to form single 
stranded amplification products. Single-stranded nucleic acids may refold or form 
secondary structures which are partially dependent on the base sequence. The 
different electrophoretic mobilities of single-stranded amplification products can be 
10 related to base-sequence differences between alleles of target sequences. 

III. Methods of Use

After determining polymorphic form(s) present in an individual at one or more 
polymorphic sites, this information can be used in a number of methods. 

A. Forensics 

Determination of which polymorphic forms occupy a set of polymorphic sites
in an individual identifies a set of polymorphic forms that distinguishes the individual.
See generally National Research Council, The Evaluation of Forensic DNA Evidence
(Eds. Pollard et al., National Academy Press, DC, 1996). The more sites that are
analyzed, the lower the probability that the set of polymorphic forms in one individual
is the same as that in an unrelated individual. Preferably, if multiple sites are
analyzed, the sites are unlinked. Thus, polymorphisms of the invention are often used
in conjunction with polymorphisms in distal genes. Preferred polymorphisms for use
in forensics are biallelic because the population frequencies of two polymorphic forms
can usually be determined with greater accuracy than those of multiple polymorphic
forms at multi-allelic loci.

The capacity to identify a distinguishing or unique set of forensic markers in
an individual is useful for forensic analysis. For example, one can determine whether
a blood sample from a suspect matches a blood or other tissue sample from a crime
scene by determining whether the set of polymorphic forms occupying selected
polymorphic sites is the same in the suspect and the sample. If the set of polymorphic
markers does not match between a suspect and a sample, it can be concluded (barring
experimental error) that the suspect was not the source of the sample. If the set of
markers does match, one can conclude that the DNA from the suspect is consistent
with that found at the crime scene. If frequencies of the polymorphic forms at the loci
tested have been determined (e.g., by analysis of a suitable population of individuals),
one can perform a statistical analysis to determine the probability that a match of
suspect and crime scene sample would occur by chance.

p(ID) is the probability that two random individuals have the same
polymorphic or allelic form at a given polymorphic site. In biallelic loci, four
genotypes are possible: AA, AB, BA, and BB. If alleles A and B occur in a haploid
genome of the organism with frequencies x and y, the probability of each genotype in
a diploid organism is (see WO 95/12607):

Homozygote: p(AA) = x^2
Homozygote: p(BB) = y^2 = (1-x)^2
Single Heterozygote: p(AB) = p(BA) = xy = x(1-x)
Both Heterozygotes: p(AB+BA) = 2xy = 2x(1-x)

The probability of identity at one locus (i.e., the probability that two
individuals, picked at random from a population, will have identical polymorphic
forms at a given locus) is given by the equation:

p(ID) = (x^2)^2 + (2xy)^2 + (y^2)^2.
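By way of illustration only, the per-locus probability of identity can be computed as the sum of the squared genotype frequencies. The following Python sketch assumes the genotype proportions given above (x^2, 2xy, y^2) and generalizes them to any number of alleles; the function names are hypothetical.

```python
from itertools import combinations_with_replacement


def genotype_frequencies(allele_freqs):
    """Expected frequency of each unordered genotype, keyed by a pair of allele indices."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    freqs = {}
    for i, j in combinations_with_replacement(range(len(allele_freqs)), 2):
        product = allele_freqs[i] * allele_freqs[j]
        # Heterozygotes arise in two orders (AB and BA), hence the factor of 2.
        freqs[(i, j)] = product if i == j else 2.0 * product
    return freqs


def probability_of_identity(allele_freqs):
    """p(ID): chance that two random individuals share a genotype at this locus."""
    return sum(f * f for f in genotype_frequencies(allele_freqs).values())


if __name__ == "__main__":
    # Biallelic locus with x = y = 0.5: (0.25)^2 + (0.5)^2 + (0.25)^2
    print(probability_of_identity([0.5, 0.5]))
```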

These calculations can be extended for any number of polymorphic forms at a
given locus. For example, the probability of identity p(ID) for a 3-allele system
where the alleles have the frequencies in the population of x, y and z, respectively, is
equal to the sum of the squares of the genotype frequencies:

p(ID) = x^4 + (2xy)^2 + (2yz)^2 + (2xz)^2 + z^4 + y^4

In a locus of n alleles, the appropriate binomial expansion is used to calculate
p(ID) and p(exc).

The cumulative probability of identity (cum p(ID)) for each of multiple
unlinked loci is determined by multiplying the probabilities provided by each locus.

cum p(ID) = p(ID1)p(ID2)p(ID3) ... p(IDn)

The cumulative probability of non-identity for n loci (i.e., the probability that
two random individuals will be different at 1 or more loci) is given by the equation:

cum p(nonID) = 1 - cum p(ID).
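By way of illustration only, the cumulative calculation over unlinked loci can be sketched in Python as follows. The function names and the example values are hypothetical; the per-locus values are assumed to have been obtained separately for each locus.

```python
from math import prod


def cumulative_identity(p_id_per_locus):
    """cum p(ID): product of the per-locus probabilities of identity (unlinked loci)."""
    values = list(p_id_per_locus)
    if any(not 0.0 <= p <= 1.0 for p in values):
        raise ValueError("each p(ID) must lie between 0 and 1")
    return prod(values)


def cumulative_non_identity(p_id_per_locus):
    """cum p(nonID) = 1 - cum p(ID)."""
    return 1.0 - cumulative_identity(p_id_per_locus)


if __name__ == "__main__":
    # Ten hypothetical biallelic loci, each with p(ID) = 0.375
    loci = [0.375] * 10
    print(cumulative_identity(loci), cumulative_non_identity(loci))
```

Because each factor is less than one, the cumulative probability of identity shrinks geometrically as loci are added, which is why a modest panel of unlinked sites can be highly discriminating.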

If several polymorphic loci are tested, the cumulative probability of
non-identity for random individuals becomes very high (e.g., one billion to one). Such
probabilities can be taken into account together with other evidence in determining
the guilt or innocence of the suspect.

B. Paternity Testing

The object of paternity testing is usually to determine whether a male is the
father of a child. In most cases, the mother of the child is known and thus, the
mother's contribution to the child's genotype can be traced. Paternity testing
investigates whether the part of the child's genotype not attributable to the mother is
consistent with that of the putative father. Paternity testing can be performed by
analyzing sets of polymorphisms in the putative father and the child.

If the set of polymorphisms in the child attributable to the father does not
match the set of polymorphisms of the putative father, it can be concluded, barring
experimental error, that the putative father is not the real father. If the set of
polymorphisms in the child attributable to the father does match the set of
polymorphisms of the putative father, a statistical calculation can be performed to
determine the probability of coincidental match.

The probability of parentage exclusion (representing the probability that a
random male will have a polymorphic form at a given polymorphic site that makes
him incompatible as the father) is given by the equation (see WO 95/12607):

p(exc) = xy(1-xy)

where x and y are the population frequencies of alleles A and B of a biallelic
polymorphic site.

(At a triallelic site p(exc) = xy(1-xy) + yz(1-yz) + xz(1-xz) + 3xyz(1-xyz),
where x, y and z are the respective population frequencies of alleles A, B and C.)

The probability of non-exclusion is

p(non-exc) = 1 - p(exc)

The cumulative probability of non-exclusion (representing the value obtained
when n loci are used) is thus:

cum p(non-exc) = p(non-exc1)p(non-exc2)p(non-exc3) ... p(non-excn)

The cumulative probability of exclusion for n loci (representing the probability
that a random male will be excluded) is:

cum p(exc) = 1 - cum p(non-exc).

If several polymorphic loci are included in the analysis, the cumulative
probability of exclusion of a random male is very high. This probability can be taken
into account in assessing the liability of a putative father whose polymorphic marker
set matches the child's polymorphic marker set attributable to his/her father.
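By way of illustration only, the exclusion calculations of this section can be sketched in Python. The formulas are transcribed as stated in the text; the function names and the example allele frequencies are hypothetical.

```python
from math import prod


def p_exclusion_biallelic(x):
    """p(exc) = xy(1 - xy) for a biallelic site, with y = 1 - x."""
    y = 1.0 - x
    return x * y * (1.0 - x * y)


def p_exclusion_triallelic(x, y, z):
    """p(exc) for a triallelic site, following the expression given in the text."""
    return (x * y * (1.0 - x * y) + y * z * (1.0 - y * z)
            + x * z * (1.0 - x * z) + 3.0 * x * y * z * (1.0 - x * y * z))


def cumulative_exclusion(p_exc_per_locus):
    """cum p(exc) = 1 - product over loci of (1 - p(exc))."""
    return 1.0 - prod(1.0 - p for p in p_exc_per_locus)


if __name__ == "__main__":
    # Twenty hypothetical biallelic sites, each with allele frequency 0.5
    per_locus = [p_exclusion_biallelic(0.5)] * 20
    print(per_locus[0], cumulative_exclusion(per_locus))
```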

C. Correlation of Polymorphisms with Phenotypic Traits 

The polymorphisms of the invention may contribute to the phenotype of an
organism in different ways. Some polymorphisms occur within a protein coding
sequence and contribute to phenotype by affecting protein structure. The effect may
be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on
the circumstances. For example, a heterozygous sickle cell mutation confers
resistance to malaria, but a homozygous sickle cell mutation is usually lethal. Other
polymorphisms occur in noncoding regions but may exert phenotypic effects
indirectly via influence on replication, transcription, and translation. A single
polymorphism may affect more than one phenotypic trait. Likewise, a single
phenotypic trait may be affected by polymorphisms in different genes. Further, some
polymorphisms predispose an individual to a distinct mutation that is causally related
to a certain phenotype.

Phenotypic traits include diseases that have known but hitherto unmapped
genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan
syndrome, muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial
hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von
Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia,
familial colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and
acute intermittent porphyria). Phenotypic traits also include symptoms of, or
susceptibility to, multifactorial diseases of which a component is or may be genetic,
such as autoimmune diseases, inflammation, cancer, diseases of the nervous system,
and infection by pathogenic microorganisms. Some examples of autoimmune
diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent
and non-independent), systemic lupus erythematosus and Graves disease. Some
examples of cancers include cancers of the bladder, brain, breast, colon, esophagus,
kidney, leukemia, liver, lung, oral cavity, ovary, pancreas, prostate, skin, stomach and
uterus. Phenotypic traits also include characteristics such as longevity, appearance
(e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or
receptivity to particular drugs or therapeutic treatments.

The correlation of one or more polymorphisms with phenotypic traits can be
facilitated by knowledge of the gene product of the wild type (reference) gene. The
genes in which cSNPs of the present invention have been identified are genes which
have been previously sequenced and characterized in one of their allelic forms. For
example, genes of the present invention in which cSNPs have been identified include
genes encoding antithrombin III (Humphries, Semin Hematol 52:8-16 (1995);
Mammen, Semin Hematol 32:2-6 (1995)), cholesterol ester transfer protein (Bruce
and Tall, Curr Opin Lipidol 6:306-311 (1995)), CLanalog (HDL/scavenger receptor)
(Freeman, Curr Opin Hematol 4:41-47 (1997); Knecht and Glass, Adv Genet
52:141-198 (1995); Rigotti et al., Curr Opin Lipidol 5:181-188 (1997)), thrombin receptor
(Brass and Molino, Thromb Haemost 75:234-241 (1997); Jamieson, Thromb Haemost
75:242-246 (1997)), thrombin (Eisenberg, Coron Artery Dis 7:400-408 (1996);
Jamieson, Thromb Haemost 75:242-246 (1997)), and heparin cofactor II (Bick and
Pegram, Semin Thromb Hemost 20:109-132 (1994)). Also included are the genes
encoding HMG coA-reductase (Bjelajac et al., Ann Pharmacother 30:1304-1315
(1996)), platelet glycoprotein IIB and IIIA (Jamieson, Thromb Haemost 75:242-246
(1997); Lefkovits et al., N Engl J Med 552:1553-1559 (1995); Nurden, Thromb
Haemost 74:345-351 (1995)), lecithin:cholesterol acyltransferase (Kuivenhoven et al.,
J Lipid Res 55:191-205 (1997)), LDL receptor (Holvoet and Collen, Curr Opin
Lipidol 5:320-328 (1997); Rigotti et al., Curr Opin Lipidol 5:181-188 (1997)),
protein C (Bertina, Clin Chem 45:1678-1683 (1997); Bick and Pegram, Semin
Thromb Hemost 20:109-132 (1994); Humphries, Semin Hematol 52:8-16 (1995);
Koeleman et al., Semin Hematol 54:256-264 (1997)), platelet activating factor
receptor (Feuerstein et al., J Lipid Mediat Cell Signal 75:255-284 (1997); Shimizu
and Mutoh, Adv Exp Med Biol 407:197-204 (1997)), tissue factor (Abildgaard, Blood
Coagul Fibrinolysis 6:S45-49 (1995); Bick and Pegram, Semin Thromb Hemost
20:109-132 (1994); Harker et al., Haemostasis 7:76-82 (1996); Ruf and Edgington,
Faseb 75:385-390 (1994)), tissue factor pathway inhibitor (Shimizu and Mutoh, Adv
Exp Med Biol 407:197-204 (1997); Feuerstein et al., J Lipid Mediat Cell Signal
75:255-284 (1997)), thromboxane A2 receptor (Feuerstein et al., J Lipid Mediat Cell
Signal 75:255-284 (1997); Kinsella et al., Ann NY Acad Sci 774:270-278 (1994);
Patrono and Renda, Am J Cardiol 50:17E-20E (1997)), lipoprotein lipase
(Applebaum-Bowden, Curr Opin Lipidol 6:130-135 (1995)), and factor V (Bertina,
Clin Chem 45:1678-1683 (1997); Harker et al., Haemostasis 7:76-82 (1996);
Koeleman et al., Semin Hematol 54:256-264 (1997)).

Correlation is performed for a population of individuals who have been tested
for the presence or absence of a phenotypic trait of interest and for polymorphic
marker sets. To perform such analysis, the presence or absence of a set of
polymorphisms (i.e., a polymorphic set) is determined for a set of the individuals,
some of whom exhibit a particular trait, and some of whom lack the trait.
The alleles of each polymorphism of the set are then reviewed to determine whether
the presence or absence of a particular allele is associated with the trait of interest.
Correlation can be performed by standard statistical methods such as a chi-squared test,
and statistically significant correlations between polymorphic form(s) and phenotypic
characteristics are noted. For example, it might be found that the presence of allele
A1 at polymorphism A correlates with heart disease. As a further example, it might
be found that the combined presence of allele A1 at polymorphism A and allele B1 at
polymorphism B correlates with increased milk production of a farm animal.
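By way of illustration only, a standard 2x2 test of association between carrying an allele and exhibiting a trait can be sketched in Python as follows. The sketch uses the Pearson chi-squared statistic without continuity correction, which is one of several standard choices; the function name and the counts are hypothetical and do not represent data from this specification.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (1 degree of freedom) for a 2x2 table.

    a -- allele carriers exhibiting the trait
    b -- allele carriers lacking the trait
    c -- non-carriers exhibiting the trait
    d -- non-carriers lacking the trait
    Returns (chi2, p_value).
    """
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p = chi_squared_2x2(30, 20, 15, 35)  # made-up counts
    print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

When many polymorphisms are tested against the same trait, the significance threshold would ordinarily be adjusted for multiple comparisons; that step is not shown here.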

Such correlations can be exploited in several ways. In the case of a strong
correlation between a set of one or more polymorphic forms and a disease for which
treatment is available, detection of the polymorphic form set in a human or animal
patient may justify immediate administration of treatment, or at least the institution of
regular monitoring of the patient. Detection of a polymorphic form correlated with
serious disease in a couple contemplating a family may also be valuable to the couple
in their reproductive decisions. For example, the female partner might elect to
undergo in vitro fertilization to avoid the possibility of transmitting such a
polymorphism from her husband to her offspring. In the case of a weaker, but still
statistically significant correlation between a polymorphic set and human disease,
immediate therapeutic intervention or monitoring may not be justified. Nevertheless,
the patient can be motivated to begin simple life-style changes (e.g., diet, exercise)
that can be accomplished at little cost to the patient but confer potential benefits in
reducing the risk of conditions to which the patient may have increased susceptibility
by virtue of variant alleles. Identification of a polymorphic set in a patient correlated
with enhanced receptiveness to one of several treatment regimes for a disease
indicates that this treatment regime should be followed.

For animals and plants, correlations between characteristics and phenotype are
useful for breeding for desired characteristics. For example, Beitz et al., US
5,292,639 discuss use of bovine mitochondrial polymorphisms in a breeding program
to improve milk production in cows. To evaluate the effect of mtDNA D-loop
sequence polymorphism on milk production, each cow was assigned a value of 1 if
variant or 0 if wildtype with respect to a prototypical mitochondrial DNA sequence at
each of 17 locations considered. Each production trait was analyzed individually with
the following animal model:

Y_ijkpn = μ + YS_i + P_j + X_k + β_1 + ... + β_17 + PE_n + a_n + e_p

where Y_ijkpn is the milk, fat, fat percentage, SNF, SNF percentage, energy
concentration, or lactation energy record; μ is an overall mean; YS_i is the effect
common to all cows calving in year-season; X_k is the effect common to cows in either
the high or average selection line; β_1 to β_17 are the binomial regressions of production
record on mtDNA D-loop sequence polymorphisms; PE_n is permanent environmental
effect common to all records of cow n; a_n is effect of animal n and is composed of the
additive genetic contribution of sire and dam breeding values and a Mendelian
sampling effect; and e_p is a random residual. It was found that eleven of seventeen
polymorphisms tested influenced at least one production trait. Bovines having the
best polymorphic forms for milk production at these eleven loci are used as parents
for breeding the next generation of the herd.

D. Genetic Mapping of Phenotypic Traits 

The previous section concerns identifying correlations between phenotypic
traits and polymorphisms that directly or indirectly contribute to those traits. The
present section describes identification of a physical linkage between a genetic locus
associated with a trait of interest and polymorphic markers that are not associated with
the trait, but are in physical proximity with the genetic locus responsible for the trait
and co-segregate with it. Such analysis is useful for mapping a genetic locus
associated with a phenotypic trait to a chromosomal position, and thereby cloning
gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83,
7353-7357 (1986); Lander et al., Proc. Natl. Acad. Sci. (USA) 84, 2363-2367 (1987);
Donis-Keller et al., Cell 51, 319-337 (1987); Lander et al., Genetics 121, 185-199
(1989). Genes localized by linkage can be cloned by a process known as directional
cloning. See Wainwright, Med. J. Australia 159, 170-174 (1993); Collins, Nature
Genetics 1, 3-6 (1992).

Linkage studies are typically performed on members of a family. Available
members of the family are characterized for the presence or absence of a phenotypic
trait and for a set of polymorphic markers. The distribution of polymorphic markers
in an informative meiosis is then analyzed to determine which polymorphic markers
co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080
(1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40,
222-226 (1990); Rossiter et al., FASEB Journal 5, 21-27 (1991).

Linkage is analyzed by calculation of LOD (log of the odds) values. A lod
value is the relative likelihood of obtaining observed segregation data for a marker
and a genetic locus when the two are located at a recombination fraction θ, versus the
situation in which the two are not linked, and thus segregating independently
(Thompson & Thompson, Genetics in Medicine (5th ed, W.B. Saunders Company,
Philadelphia, 1991); Strachan, "Mapping the human genome" in The Human Genome
(BIOS Scientific Publishers Ltd, Oxford), Chapter 4). A series of likelihood ratios are
calculated at various recombination fractions (θ), ranging from θ = 0.0 (coincident
loci) to θ = 0.50 (unlinked). Thus, the likelihood at a given value of θ is: probability
of data if loci linked at θ to probability of data if loci unlinked. The computed
likelihoods are usually expressed as the log10 of this ratio (i.e., a lod score). For
example, a lod score of 3 indicates 1000:1 odds against an apparent observed linkage
being a coincidence. The use of logarithms allows data collected from different
families to be combined by simple addition. Computer programs are available for the
calculation of lod scores for differing values of θ (e.g., LIPED, MLINK (Lathrop,
Proc. Nat. Acad. Sci. (USA) 81, 3443-3446 (1984))). For any particular lod score, a
recombination fraction may be determined from mathematical tables. See Smith et
al., Mathematical tables for research workers in human genetics (Churchill, London,
1961); Smith, Ann. Hum. Genet. 32, 127-150 (1968). The value of θ at which the lod
score is the highest is considered to be the best estimate of the recombination fraction.

Positive lod score values suggest that the two loci are linked, whereas negative
values suggest that linkage is less likely (at that value of θ) than the possibility that
the two loci are unlinked. By convention, a combined lod score of +3 or greater
(equivalent to greater than 1000:1 odds in favor of linkage) is considered definitive
evidence that two loci are linked. Similarly, by convention, a negative lod score of -2
or less is taken as definitive evidence against linkage of the two loci being compared.
Negative linkage data are useful in excluding a chromosome or a segment thereof
from consideration. The search focuses on the remaining non-excluded chromosomal
locations.
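By way of illustration only, a lod score for the simplest case can be sketched in Python. The sketch assumes phase-known, fully informative meioses, in which the likelihood at recombination fraction θ is θ^r (1-θ)^(n-r) for r recombinants among n meioses and the likelihood under no linkage is (0.5)^n. Programs such as LIPED and MLINK handle general pedigrees and are not reproduced here; the function names and counts are hypothetical.

```python
from math import log10


def lod_score(recombinants, meioses, theta):
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if theta == 0.0:
        # Any observed recombinant is impossible if the loci coincide.
        return float("-inf") if recombinants else meioses * log10(2.0)
    non_recombinants = meioses - recombinants
    return (recombinants * log10(theta)
            + non_recombinants * log10(1.0 - theta)
            - meioses * log10(0.5))


def best_theta(recombinants, meioses, steps=50):
    """Grid value of theta in [0, 0.5] at which the lod score is highest."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: lod_score(recombinants, meioses, t))


if __name__ == "__main__":
    # Two hypothetical families scored at the same theta are combined by addition.
    combined = lod_score(1, 10, 0.1) + lod_score(0, 6, 0.1)
    print(best_theta(1, 10), combined)
```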

IV. Modified Polypeptides and Gene Sequences 

The invention further provides variant forms of nucleic acids and
corresponding proteins. The nucleic acids comprise one of the sequences described in
the Table, column 8, in which the polymorphic position is occupied by one of the
alternative bases for that position. Some nucleic acids encode full-length variant
forms of proteins. Similarly, variant proteins have the prototypical amino acid
sequences encoded by nucleic acid sequences shown in the Table, column 8, (read so
as to be in-frame with the full-length coding sequence of which it is a component)
except at an amino acid encoded by a codon including one of the polymorphic
positions shown in the Table. That position is occupied by the amino acid coded by
the corresponding codon in any of the alternative forms shown in the Table.

Variant genes can be expressed in an expression vector in which a variant gene
is operably linked to a native or other promoter. Usually, the promoter is a eukaryotic
promoter for expression in a mammalian cell. The transcription regulation sequences
typically include a heterologous promoter and optionally an enhancer which is
recognized by the host. The selection of an appropriate promoter, for example trp,
lac, phage promoters, glycolytic enzyme promoters and tRNA promoters, depends on
the host selected. Commercially available expression vectors can be used. Vectors
can include host-recognized replication systems, amplifiable genes, selectable
markers, host sequences useful for insertion into the host genome, and the like.

The means of introducing the expression construct into a host cell varies
depending upon the particular construction and the target host. Suitable means
include fusion, conjugation, transfection, transduction, electroporation or injection, as
described in Sambrook, supra. A wide variety of host cells can be employed for
expression of the variant gene, both prokaryotic and eukaryotic. Suitable host cells
include bacteria such as E. coli, yeast, filamentous fungi, insect cells, mammalian
cells, typically immortalized, e.g., mouse, CHO, human and monkey cell lines and
derivatives thereof. Preferred host cells are able to process the variant gene product to
produce an appropriate mature polypeptide. Processing includes glycosylation,
ubiquitination, disulfide bond formation, general post-translational modification, and
the like. As used herein, "gene product" includes mRNA, peptide and protein
products.

The protein may be isolated by conventional means of protein biochemistry
and purification to obtain a substantially pure product, i.e., 80, 95 or 99% free of cell
component contaminants, as described in Jacoby, Methods in Enzymology Volume
104, Academic Press, New York (1984); Scopes, Protein Purification, Principles and
Practice, 2nd Edition, Springer-Verlag, New York (1987); and Deutscher (ed), Guide
to Protein Purification, Methods in Enzymology, Vol. 182 (1990). If the protein is
secreted, it can be isolated from the supernatant in which the host cell is grown. If not
secreted, the protein can be isolated from a lysate of the host cells.

The invention further provides transgenic nonhuman animals capable of
expressing an exogenous variant gene and/or having one or both alleles of an
endogenous variant gene inactivated. Expression of an exogenous variant gene is
usually achieved by operably linking the gene to a promoter and optionally an
enhancer, and microinjecting the construct into a zygote. See Hogan et al.,
"Manipulating the Mouse Embryo, A Laboratory Manual," Cold Spring Harbor
Laboratory. Inactivation of endogenous variant genes can be achieved by forming a
transgene in which a cloned variant gene is inactivated by insertion of a positive
selection marker. See Capecchi, Science 244, 1288-1292 (1989). The transgene is
then introduced into an embryonic stem cell, where it undergoes homologous
recombination with an endogenous variant gene. Mice and other rodents are preferred
animals. Such animals provide useful drug screening systems.

In addition to substantially full-length polypeptides expressed by variant
genes, the present invention includes biologically active fragments of the
polypeptides, or analogs thereof, including organic molecules which simulate the
interactions of the peptides. Biologically active fragments include any portion of the
full-length polypeptide which confers a biological function on the variant gene
product, including ligand binding, and antibody binding. Ligand binding includes
binding by nucleic acids, proteins or polypeptides, small biologically active
molecules, or large cellular structures.

Polyclonal and/or monoclonal antibodies that specifically bind to variant gene
products but not to corresponding prototypical gene products are also provided.
Antibodies can be made by injecting mice or other animals with the variant gene
product or synthetic peptide fragments thereof. Monoclonal antibodies are screened
as described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual,
Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies,
Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal
antibodies are tested for specific immunoreactivity with a variant gene product and
lack of immunoreactivity to the corresponding prototypical gene product. These
antibodies are useful in diagnostic assays for detection of the variant form, or as an
active ingredient in a pharmaceutical composition.

V. Kits

The invention further provides kits comprising at least one allele-specific
oligonucleotide as described above. Often, the kits contain one or more pairs of
allele-specific oligonucleotides hybridizing to different forms of a polymorphism. In
some kits, the allele-specific oligonucleotides are provided immobilized to a substrate.
For example, the same substrate can comprise allele-specific oligonucleotide probes
for detecting at least 10, 100 or all of the polymorphisms shown in the Table.
Optional additional components of the kit include, for example, restriction enzymes,
reverse-transcriptase or polymerase, the substrate nucleoside triphosphates, means
used to label (for example, an avidin-enzyme conjugate and enzyme substrate and
chromogen if the label is biotin), and the appropriate buffers for reverse transcription,
PCR, or hybridization reactions. Usually, the kit also contains instructions for
carrying out the methods.

The following Examples are offered for the purpose of illustrating the present
invention and are not to be construed to limit the scope of this invention. The
teachings of all references cited herein are hereby incorporated herein by reference.

EXAMPLES 

The polymorphisms shown in the Table were identified by resequencing of
target sequences from a minimum of 50 unrelated individuals of diverse ethnic and
geographic backgrounds by hybridization to probes immobilized to microfabricated
arrays. The strategy and principles for design and use of such arrays are generally
described in WO 95/11995.

A typical probe array used in this analysis has two groups of four sets of
probes that respectively tile both strands of a reference sequence. A first probe set
comprises a plurality of probes exhibiting perfect complementarity with one of the
reference sequences. Each probe in the first probe set has an interrogation position
that corresponds to a nucleotide in the reference sequence. That is, the interrogation
position is aligned with the corresponding nucleotide in the reference sequence, when
the probe and reference sequence are aligned to maximize complementarity between
the two. For each probe in the first set, there are three corresponding probes from
three additional probe sets. Thus, there are four probes corresponding to each
nucleotide in the reference sequence. The probes from the three additional probe sets
are identical to the corresponding probe from the first probe set except at the
interrogation position, which occurs in the same position in each of the four
corresponding probes from the four probe sets, and is occupied by a different
nucleotide in the four probe sets. In the present analysis, probes were 25 nucleotides
long. Arrays tiled for multiple different reference sequences were included on the
same substrate.
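By way of illustration only, the four-probe tiling described above can be sketched in Python for one strand of a reference sequence. Placing the interrogation position at the center of the probe is an assumption of this sketch, as are the function names; the other strand would be tiled by applying the same function to the reverse complement of the reference.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def tile_probes(reference, probe_length=25):
    """Yield (ref_index, probes) for each reference position that can be centered.

    `probes` maps each candidate target base to the probe that is perfectly
    complementary to the reference window carrying that base at `ref_index`.
    The probe keyed by the reference base belongs to the first probe set.
    """
    if probe_length % 2 == 0:
        raise ValueError("probe_length must be odd so that a central position exists")
    half = probe_length // 2
    for pos in range(half, len(reference) - half):
        window = reference[pos - half:pos + half + 1]
        probes = {}
        for base in "ACGT":
            variant = window[:half] + base + window[half + 1:]
            probes[base] = reverse_complement(variant)
        yield pos, probes


if __name__ == "__main__":
    for pos, probes in tile_probes("ACGTACG", probe_length=5):  # short made-up example
        print(pos, probes)
```

In use, the base at a given position of a sample would be called from whichever of the four probes gives the strongest hybridization signal.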

Publicly available sequences for a given gene were assembled into Gap4
(http://www.biozentrum.unibas.ch/~biocomp/staden/Overview.html). PCR primers
covering each exon were designed using Primer 3
(http://www-genome.wi.mit.edu/cgi-bin/primer/primer3.cgi). Primers were not designed in regions
where there were sequence discrepancies between reads. For CLA1, whose genomic
sequence is not published, nested primers were designed from the cDNA. For all
genes except CLA1, genomic DNA was amplified in at least 50 individuals using 2.5
pmol each primer, 1.5 mM MgCl2, 100 µM dNTPs, 0.75 nM AmpliTaq GOLD
polymerase, and 19 ng DNA in a 15 µl reaction. Reactions were assembled using a
PACKARD MultiPROBE robotic pipetting station and then put in MJ 96-well tetrad
thermocyclers (96°C for 10 minutes, followed by 35 cycles of 96°C for 30 seconds,
59°C for 2 minutes, and 72°C for 2 minutes). A subset of the PCR assays for each
individual was run on 3% NuSieve gels in 0.5X TBE to confirm that the reaction
worked.

For CLA1, first strand cDNA was made using the Gibco BRL Superscript
Preamplification Kit (#18089-011) and following the manufacturer's instructions
except that 150 ng of random hexamers were used to prime 1 µg of total RNA. The
cDNA was amplified using the outermost primer pairs and the above conditions; 1/20
of the reaction was used as a template for the secondary PCR using the innermost
primers. All RT-PCR products were run on 2% NuSieve gels in 1X TAE to confirm
the presence of a product.

For a given DNA, 5 µl (about 50 ng) of each PCR or RT-PCR product were
pooled (final volume = 150-200 µl). The products were purified using QiaQuick
PCR purification from Qiagen. The samples were eluted once in 35 µl sterile water
and 4 µl 10X One-Phor-All buffer (Pharmacia). The pooled samples were digested
with 0.2 n DNaseI (Promega) for 10 minutes at 37°C and then labeled with 0.5 nmols
biotin-N6-ddATP and 15 \i Terminal Transferase (GibcoBRL Life Technology) for 60
minutes at 37°C. Both fragmentation and labeling reactions were terminated by
incubating the pooled sample for 15 minutes at 100°C.

Low-density DNA chips (Affymetrix, CA) were hybridized following the
manufacturer's instructions. Briefly, the hybridization cocktail consisted of 3M
TMACl, 10 mM Tris pH 7.8, 0.01% Triton X-100, 100 mg/ml herring sperm DNA
(Gibco BRL), and 200 pM control biotin-labeled oligo. The processed PCR products
were denatured for 7 minutes at 100°C and then added to prewarmed (37°C)
hybridization solution. The chips were hybridized overnight at 44°C. Chips were
washed in 1X SSPET and 6X SSPET followed by staining with 2 ng/ml SARPE and
0.5 mg/ml acetylated BSA in 200 nl of 6X SSPET for 8 minutes at room temperature.
Chips were scanned using a Molecular Dynamics scanner.

Chip image files were analyzed using Ulysses (Affymetrix, CA), which uses 
four algorithms to identify potential polymorphisms. Candidate polymorphisms were 
visually inspected and assigned a confidence value: high confidence candidates 
displayed all three genotypes, while likely candidates showed only two genotypes 
(homozygous for reference sequence and heterozygous for reference and variant). 
Some of the candidate polymorphisms were confirmed by ABI sequencing. Identified 
polymorphisms were compared to SwissProt and the Mutation Database to determine 
if they were novel. Results are shown in the Table. 
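
The confidence rule stated above can be sketched in a few lines of Python. This is an illustration only and is not the Ulysses software: the function name, the genotype labels and the "unclassified" fallback are assumptions made for the example, since the text describes only the high confidence and likely categories.

```python
# Illustrative sketch of the confidence rule described in the text.
# Genotype labels and the function name are assumptions, not part of Ulysses.

HOM_REF = "ref/ref"   # homozygous for the reference sequence
HET = "ref/var"       # heterozygous for reference and variant
HOM_VAR = "var/var"   # homozygous for the variant sequence


def candidate_confidence(genotypes):
    """Assign a confidence label from the set of genotypes observed."""
    observed = set(genotypes)
    if observed == {HOM_REF, HET, HOM_VAR}:
        return "high confidence"   # all three genotypes were seen
    if observed == {HOM_REF, HET}:
        return "likely"            # only two genotypes were seen
    # The text does not say how any other pattern was handled.
    return "unclassified"


calls = [HOM_REF, HET, HOM_REF, HOM_VAR]
print(candidate_confidence(calls))
```

A candidate seen only as reference homozygotes and heterozygotes would be labeled "likely" by this sketch, matching the two-genotype case in the text.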

In the Table, the genes listed in column 2 are as follows: antithrombin III 
(AT3); cholesterol ester transfer protein (CETP); CLanalog (HDL/scavenger receptor) 
(CLanalog); thrombin receptor (F2R); thrombin (F2); heparin cofactor II (HCF2); 
HMG-CoA reductase (HMGCR); platelet glycoprotein IIb (ITGA2B); platelet 
glycoprotein IIIa (ITGB3); lecithin:cholesterol acyltransferase (LCAT); LDL 
receptor (LDLR); protein C (PROC); platelet activating factor receptor (PTAFR); 
tissue factor pathway inhibitor (TFPI); thromboxane A2 receptor (TBXA2R); 
lipoprotein lipase (LPL); tissue factor (F3); and factor V (F5). 

Column 1 of the Table shows the laboratory name for the particular gene. 
Column 3 shows the GenBank Accession number for the wild type (reference) allele. 
Column 4 shows the nucleotide number location of the polymorphism relative to the 
numbering of the sequence deposited with GenBank having the listed Accession 
number; the GenBank sequence is understood to be the nucleotide sequence present in 
the GenBank database on April 1, 1998, which sequences are incorporated herein by 
reference in their entirety. These GenBank sequences are illustrated in Figures 3-38. 
Column 5 shows the codon which is altered by the polymorphism. Columns 
6, 7 and 8 show the reference codon, variant codon and amino acid change, 
respectively, for the silent polymorphisms. Columns 9, 10 and 11 show the reference 
codon, variant codon and amino acid change, respectively, for the missense 
polymorphisms. Columns 12, 13 and 14 show the reference codon, variant codon and 
amino acid change, respectively, for the nonsense polymorphisms. Columns 15 and 
16 show the nucleotide of the reference allele and the frequency of that allele, 
respectively. This base is arbitrarily designated the reference or prototypical form, 
but it is not necessarily the most frequently occurring form. Columns 17 and 18 show 
the nucleotide of the variant allele and the frequency of that allele, respectively. It is 
noted that the genes with polymorphism IDs of F5u8, HCF2u1 and HMGCRu2 
contained the indicated polymorphism at the indicated nucleotide position, but that 
these nucleotide positions are in the non-coding region of the gene. 
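
The relationship between a nucleotide position, the codon it falls in, and the silent, missense and nonsense categories used in the Table can be illustrated with a short Python sketch. The function names and the example codons are assumptions chosen for illustration; the only values taken from this document are the coding-sequence start of 113 given in the GenBank record of FIG. 3A and the Met-to-Thr change reported for F5u4, for which ATG to ACG is the single-base change consistent with the standard genetic code.

```python
# Sketch: find the codon containing a nucleotide position and classify a
# single-codon change. Names and example values are illustrative assumptions.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order ('*' = stop).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_of(position, cds_start):
    """Return (codon number, 0-based offset within the codon) for a 1-based
    nucleotide position, given the 1-based start of the coding sequence."""
    if position < cds_start:
        raise ValueError("position lies upstream of the coding sequence")
    index = position - cds_start
    return index // 3 + 1, index % 3


def classify(ref_codon, var_codon):
    """Classify a codon change as silent, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    var_aa = CODON_TABLE[var_codon.upper()]
    if ref_aa == var_aa:
        return "silent"
    if var_aa == "*":
        return "nonsense"
    return "missense"


print(codon_of(113, 113))        # first base of the first codon
print(classify("ATG", "ACG"))    # Met -> Thr
print(classify("CTG", "CTA"))    # Leu -> Leu
print(classify("TGG", "TGA"))    # Trp -> stop
```

Positions upstream of the coding sequence raise an error in this sketch; positions in other non-coding regions, such as those noted for F5u8, HCF2u1 and HMGCRu2, would need the coding-sequence end as well and are outside its scope.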

Genotyping and genetic association studies were performed with respect to the 
allelic forms of the F5U4 and HCF2U4 genes, and the presence of the reference and 
variant alleles (as shown in Table 1) was correlated with the occurrence of venous 
thrombosis and pulmonary emboli. The results are shown in Tables 2 and 3. 

TABLE 2: HCF2U4 GENETIC ASSOCIATION STUDY 

                Case    Control 
Reference       115     115 
Heterozygote    5       0 

(p = 0.027 by Chi-square test) 

(p = 0.06 by Fisher's exact test (two-tailed)). 
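
The two p-values quoted for Table 2 can be re-derived from the four counts with the Python standard library. The sketch below assumes an uncorrected Pearson chi-square test with one degree of freedom and the usual two-tailed Fisher definition (summing all tables no more probable than the observed one); the text does not state which variants were used, and the function names are illustrative.

```python
# Sketch: re-derive the Table 2 p-values with the standard library only.
# Test variants and function names are assumptions, as noted in the lead-in.
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].
    Returns (statistic, p-value) for one degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact test: sum the probabilities of every table
    with the same margins that is no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(prob, range(low, high + 1))
               if p <= observed * (1 + 1e-9))


# Rows: Reference, Heterozygote; columns: Case, Control (Table 2).
stat, p_chi = chi_square_2x2(115, 115, 5, 0)
p_fisher = fisher_exact_two_tailed(115, 115, 5, 0)
print(f"chi-square = {stat:.3f}, p = {p_chi:.3f}")
print(f"Fisher two-tailed p = {p_fisher:.2f}")
```

Under these assumptions the sketch gives values that round to the reported 0.027 and 0.06.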

The F5u4 variant leads to an amino acid substitution (Met413Thr) in the 
coagulation factor V gene. Another common variant in Factor V (Arg506Gln), the 
Leiden Variant, is the most common genetic factor predisposing to thrombosis that 
has been identified to date. Genotyping of patients with deep venous thrombosis has 
confirmed a statistical association of this variant with deep venous 
thrombosis/pulmonary embolism in two separate populations of patients, as shown 
below: 

TABLE 3: F5U4 GENETIC ASSOCIATION STUDY 

                                                 ALLELE FREQ 
                  REF    HET    VAR    TOTAL     REF    VAR 
Case              226    38     5      269       91%    9% 
Control           207    28     0      235       94%    6% 
2nd Population 
Case              85     28     2      115       86%    14% 
Control           95     14     4      113       90%    10% 

(p <0.05 by Chi-square test for combined populations) 
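
The allele frequency columns of Table 3 follow from the genotype counts, because each individual carries two alleles. A minimal Python sketch of that arithmetic is given here; the function name and the "population 1" label are assumptions (the table labels only the second population).

```python
# Sketch: allele frequencies from genotype counts (two alleles per person).
# The function name and the "population 1" label are illustrative.

def allele_frequencies(ref_hom, het, var_hom):
    """Return (reference, variant) allele frequencies from genotype counts."""
    chromosomes = 2 * (ref_hom + het + var_hom)
    ref = (2 * ref_hom + het) / chromosomes
    return ref, 1 - ref


# Genotype counts (REF, HET, VAR) copied from Table 3.
table3 = {
    ("population 1", "case"): (226, 38, 5),
    ("population 1", "control"): (207, 28, 0),
    ("population 2", "case"): (85, 28, 2),
    ("population 2", "control"): (95, 14, 4),
}

for (population, group), counts in table3.items():
    ref, var = allele_frequencies(*counts)
    print(f"{population} {group}: REF {ref:.0%}, VAR {var:.0%}")
```

Rounded to whole percentages, the computed values agree with the frequencies listed in the table.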

These data indicate that there is a trend toward an association between the 
presence of the variant allele (or heterozygosity) and the occurrence of venous 
thrombosis and/or pulmonary emboli. 

From the foregoing, it is apparent that the invention includes a number of 
general uses that can be expressed concisely as follows. The invention provides for 
the use of any of the nucleic acid segments described above in the diagnosis or 
monitoring of diseases, such as cancer, inflammation, heart disease, diseases of the 
cardiovascular system, and infection by microorganisms. The invention further 
provides for the use of any of the nucleic acid segments in the manufacture of a 
medicament for the treatment or prophylaxis of such diseases. The invention further 
provides for the use of any of the DNA segments as a pharmaceutical. 

All references cited above are incorporated by reference in their entirety for all 
purposes to the same extent as if each individual publication or patent application 
were specifically and individually indicated to be so incorporated by reference. 



While this invention has been particularly shown and described with 
references to preferred embodiments thereof, it will be understood by those skilled 
in the art that various changes in form and details may be made therein without 
departing from the spirit and scope of the invention as defined by the appended 
claims. 

CLAIMS 

WE CLAIM: 

1. A nucleic acid molecule selected from the group consisting of the genes listed 
in the Table, wherein said nucleic acid molecule is at least 5 nucleotides in 
length and comprises a polymorphic site identified in the Table, wherein a 
nucleotide at the polymorphic site is different from a nucleotide at the 
polymorphic site in a corresponding reference allele. 

2. A nucleic acid molecule according to Claim 1, wherein said nucleic acid 
molecule is at least 10 nucleotides in length. 

3. A nucleic acid molecule according to Claim 1, wherein said nucleic acid 
molecule is at least 20 nucleotides in length. 

4. A nucleic acid molecule according to Claim 1, wherein the nucleotide at the 
polymorphic site is the variant nucleotide for the gene listed in the Table. 

5. An allele-specific oligonucleotide that hybridizes to a portion of a gene 
selected from the group consisting of the genes listed in the Table, wherein 
said portion is at least 5 nucleotides in length and comprises a polymorphic 
site identified in the Table, wherein a nucleotide at the polymorphic site is 
different from a nucleotide at the polymorphic site in a corresponding 
reference allele. 

6. An allele-specific oligonucleotide according to Claim 5 that is a probe. 

7. An allele-specific oligonucleotide according to Claim 5, wherein a central 
position of the probe aligns with the polymorphic site of the portion. 

8. An allele-specific oligonucleotide according to Claim 5 that is a primer. 

9. An allele-specific oligonucleotide according to Claim 8, wherein the 3' end of 
the primer aligns with the polymorphic site of the portion. 

10. An isolated gene product encoded by a nucleic acid molecule according to 
Claim 1. 

11. A method of analyzing a nucleic acid sample, comprising obtaining the 
nucleic acid from an individual sample; and determining a base occupying any 
one of the polymorphic sites shown in the Table. 

12. A method according to Claim 11, wherein the nucleic acid sample is obtained 
from a plurality of individuals, and a base occupying one of the polymorphic 
positions is determined in each of the individuals, and the method further 
comprising testing each individual for the presence of a disease phenotype, 
and correlating the presence of the disease phenotype with the base. 

D29832 : 1005 


AT3u2 


D29832 : 1035 

M21645 : 100 

LOCUS       HUMPAFRE     1780 bp    mRNA            PRI       10-OCT-1992
DEFINITION  Human mRNA for platelet-activating factor receptor, complete cds.
ACCESSION   D10202 D90433
NID         g219975
KEYWORDS    G-protein coupled receptor; PAF receptor; platelet-activating
            factor receptor.
SOURCE      Human leukocytes cDNA to mRNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Mammalia; Eutheria; Primates; Catarrhini; Hominidae;
            Homo.
REFERENCE   1  (bases 1 to 1780)
  AUTHORS   Nakamura,M., Honda,Z., Izumi,T., Sakanaka,C., Mutoh,H., Minami,M.,
            Bito,H., Seyama,Y., Noma,M., Matsumoto,T. and Shimizu,T.
  TITLE     Molecular cloning and expression of platelet-activating factor
            receptor from human leukocytes
  JOURNAL   J. Biol. Chem. 266 (30), 20400-20405 (1991)
  MEDLINE   92041873
REFERENCE   2  (bases 1 to 1780)
  AUTHORS   Shimizu,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (28-JUN-1991) to the DDBJ/EMBL/GenBank databases. Takao
            Shimizu, Faculty of Medicine, University of Tokyo, Department of
            Biochemistry; 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan
            (Tel:03-3812-2111(ex.3448), Fax:03-3813-8732)
COMMENT     Submitted (28-Jun-1991) to DDBJ by:
            Takao Shimizu
            Department of Biochemistry
            Faculty of Medicine, University of Tokyo
            7-3-1 Hongo, Bunkyo-ku
            Tokyo 113
            Japan
            Phone: 03-3812-2111 x3448
            Fax: 03-3813-8732.
FEATURES             Location/Qualifiers
     source          1..1780
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /cell_type="leukocytes"
     CDS             113..1141
                     /codon_start=1
                     /product="platelet-activating factor receptor"
                     /db_xref="PID:d1001519"
                     /db_xref="PID:g219976"
                     /translation="MEPHDSSHMDSEFRYTLFPIVYSIIFVLGVIANGYVLWVFARLY
                     PCKKFNEIKIFMVNLTMADMLFLITLPLWIVYYQNQGNWILPKFLCNVAGCLFFINTY
                     CSVAFLGVITYNRFQAVTRPIKTAQANTRKRGISLSLVIWVAIVGAASYFLILDSTNT
                     VPDSAGSGNVTRCFEHYEKGSVPVLIIHIFIVFSFFLVFLIILFCNLVIIRTLLMQPV
                     QQQRNAEVKRRALWMVCTVLAVFIICFVPHHVVQLPWTLAELGFQDSKFHQAINDAHQ
                     VTLCLLSTNCVLDPVIYCFLTKKFRKHLTEKFYSMRSSRKCSRATTDTVTEVVVPFNQ
                     IPGNSLKN"

FIG. 3A 

BASE COUNT 393 a 533 c 438 g 416 t 
ORIGIN 

1 ttcacgaggg ctggggccag gacccagaca gagacacacg gtcactgcag ctgaagccgc 

61 tgcccctgct acaggcacca ccaggaccag ctgatcattc cagcccacag caatggagcc 

121 acatgactcc tcccacatgg actctgagtt ccgatacact ctcttcccga ttgtttacag 

181 catcatcttt gtgctcgggg tcattgctaa tggctacgtg ctgtgggtct ttgcccgcct 

241 gtacccttgc aagaaattca atgagataaa gatcttcatg gtgaacctca ccatggcgga 

301 catgctcttc ttgatcaccc tgccactttg gattgtctac taccaaaacc agggcaactg 

361 gatactcccc aaattcctgt gcaacgtggc tggctgcctt ttcttcatca acacctactg 

421 ctctgtggcc ttcctgggcg tcatcactta taaccgcttc caggcagtaa ctcggcccat 

481 caagactgct caggccaaca cccgcaagcg tggcatctct ttgtccttgg tcatctgggt 

541 ggccattgtg ggagctgcat cctacttcct catcctggac tccaccaaca cagtgcccga 

601 cagtgctggc tcaggcaacg tcactcgctg ctttgagcat tacgagaagg gcagcgtgcc 

661 agtcctcatc atccacatct tcatcgtgtt cagcttcttc ctggtcttcc tcatcatcct 

721 cttctgcaac ctggtcatca tccgtacctt gctcatgcag ccggtgcagc agcagcgcaa 

781 cgctgaagtc aagcgccggg cgctgtggat ggtgtgcacg gtcttggcgg tgttcatcat 

841 ctgcctcgtg ccccaccacg tggtgcagct gccctggacc cttgctgagc tgggcttcca 

901 ggacagcaaa ttccaccagg ccattaatga tgcacatcag gtcaccctct gcctccttag 

961 caccaactgt gtcttagacc ctgttatcta ctgtttcctc accaagaagt tccgcaagca 

1021 cctcaccgaa aagttctaca gcatgcgcag tagccggaaa tgctcccggg ccaccacgga 

1081 tacggtcact gaagtggttg tgccattcaa ccagatccct ggcaattccc tcaaaaatta 

1141 gtccctgctt ccaggcctga agtcttctcc tccatgaaac atcatgactg agctggggga 

1201 agaagggata tctactgtgg gtctgggcac cacctctgtg gcactggtgg gccattagat 

1261 ttggaggcta cctcacctgg gcagggatga tgcagagcca ggctgttgga aaatccagaa 

1321 ctcaaatgag ccccttcatc cgcctgtggg cgcatactac agtaactgtg actgatgact 

1381 ttatcctgag tcccttaatc ttatggggcc ggaaggaatg tcagggccag gtgcagacct 

1441 tgggggaaga ctttaaacca cctagttctc ccactggggc atcggtctaa agctttgggg 

1501 gagtggcccc agtggctcac acctgtaatc ccagcacttt gggaggccga ggtgggcaga 

1561 tcatgggtca agagatcgag acatcctggc caacattgta aaaccccatc tctactaaaa 

1621 catacaaaaa ttagccgggc atggtgcaca cgcctgtagt cccagctact caggaggctg 

1681 aggcaggaga atcgcttgaa cctgggaggc agaggttgca gtgaacctag attgcaccat 

1741 tgcactctag cctggcaaca gaggcagatt ccctcctgcc 



FIG. 3B 

LOCUS       HUMATIIIV    1467 bp    mRNA            PRI       03-SEP-1996
DEFINITION  Human mRNA for antithrombin III variant, complete cds.
ACCESSION   D29832
NID         g576553
KEYWORDS    AT-III; antithrombin III.
SOURCE      Homo sapiens (individual_isolate AT-III Kyoto) cDNA to mRNA, clone
            pKF16c.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Mammalia; Eutheria; Primates; Catarrhini; Hominidae;
            Homo.
REFERENCE   1  (sites)
  AUTHORS   Tsuji,H., Takada,O., Nakagawa,M., Tanaka,S. and Hashimoto-Gotoh,T.
  TITLE     Hereditary antithrombin III deficiency: identification of an
            arginine-406 to methionine point mutation near protease reactive
            site
  JOURNAL   (in) Yoshida,T.O. and Wilson,J.M. (Eds.);
            MOLECULAR APPROACHES TO THE STUDY AND TREATMENT OF HUMAN DISEASES:
            51-55;
            Elsevier Science (1992)
REFERENCE   2  (bases 1 to 1467)
  AUTHORS   Hashimoto-Gotoh,T.
  JOURNAL   Unpublished (1994)
FEATURES             Location/Qualifiers
     source          1..1467
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     CDS             22..1419
                     /note="Wild type AT-III has 'g' instead of 't' at position
                     1337 nt. Also amino acid residue changes from Met to Arg
                     at position 406 aa in wild type AT-III."
                     /codon_start=1
                     /product="antithrombin III (AT-III) variant"
                     /db_xref="PID:d1006776"
                     /db_xref="PID:g576554"
                     /translation="MYSNVIGTVTSGKRKVYLLSLLLIGFWDCVTCHGSPVDICTAKP
                     RDIPMNPMCIYRSPEKKATEDEGSEQKIPEATNNRRVWELSKANSRFATTFYQHLADS
                     KNDNDNIFLSPLSISTAFAMTKLGACNDTLQQLMEVFKFDTISEKTSDQIHFFFAKLN
                     CRLYRKANKSSKLVSANRLFGDKSLTFNETYQDISELVYGAKLQPLDFKENAEQSRAA
                     INKWVSNKTEGRITDVIPSEAINELTVLVLVNTIYFKGLWKSKFSPENTRKELFYKAD
                     GESCSASMMYQEGKFRYRRVAEGTQVLELPFKGDDITMVLILPKPEKSLAKVEKELTP
                     EVLQEWLDELEEMMLVVHMPRFRIEDGFSLKEQLQDMGLVDLFSPEKSKLPGIVAEGR
                     DDLYVSDAFHKAFLEVNEEGSEAAASTAVVIAGRSLNPNRVTFKANMPFLVFIREVPL
                     NTIIFMGRVANPCVK"

FIG. 4A 

BASE COUNT 381 a 375 c 364 g 347 t 

ORIGIN 

1 gaattcgagc tcgccccggc catgtattcc aatgtgatag gaactgtaac ctctggaaaa 
61 aggaaggttt atctcttgtc cttgctgctc attggcttct gggactgcgt gacctgtcac 
121 gggagccctg tggacatctg cacagccaag ccgcgggaca ttcccatgaa tcccatgtgc 
181 atttaccgct ccccggagaa gaaggcaact gaggatgagg gctcagaaca gaagatcccg 
241 gaggccacca acaaccggcg tgtctgggaa ctgtccaagg ccaattcccg ctttgctacc 
301 actttctatc agcacctggc agattccaag aatgacaatg ataacatttt cctgtcaccc 
361 ctgagtatct ctacggcttt tgctatgacc aagctgggtg cctgtaatga caccctccag 
421 caactgatgg aggtatttaa gtttgacacc atatctgaga aaacatctga tcagatccac 
481 ttcttctttg ccaaactgaa ctgccgactc tatcgaaaag ccaacaaatc ctccaagtta 
541 gtatcagcca atcgcctttt tggagacaaa tcccttacct tcaatgagac ctaccaggac 
601 atcagtgagt tggtatatgg agccaagctc cagcccctgg acttcaagga aaatgcagag 
661 caatccagag cggccatcaa caaatgggtg tccaataaga ccgaaggccg aatcaccgat 
721 gtcattccct cggaagccat caatgagctc actgttctgg tgctggttaa caccatttac 
781 ttcaagggcc tgtggaagtc aaagttcagc cctgagaaca caaggaagga actgttctac 
841 aaggctgatg gagagtcgtg ttcagcatct atgatgtacc aggaaggcaa gttccgttat 
901 cggcgcgtgg ctgaaggcac ccaggtgctt gagttgccct tcaaaggtga cgacatcacc 
961 atggtcctca tcttgcccaa gcctgagaag agcctggcca aggtggagaa ggaactcacc 
1021 ccagaggtgc tgcaggagtg gctggatgaa ttggaggaga tgatgctggt ggtccacatg 
1081 ccccgcttcc gcattgagga cggcttcagt tcgaaggagc agctgcaaga catgggcctt 
1141 gtcgatctgt tcagccctga aaagtccaaa ctcccaggta ttgttgcaga aggccgagat 
1201 gacctctatg tctcagatgc attccataag gcatttcttg aggcaaatga agaaggcagt 
1261 gaagcagctg caagtaccgc tgttgtgatt gctggccgtt cgctaaaccc caacagggtg 
1321 actttcaagg ccaacatgcc tttcctggtt tttataagag aagttcctct gaacactatt 
1381 atcttcatgg gcagggtagc caacccttgt gttaagtaaa atgttctcta gaggatcccc 
1441 catcgatggg gtaccgagct cgaattc 



FIG. 4B 

LOCUS       HUMHTAR      2932 bp    mRNA            PRI       03-APR-1996
DEFINITION  Human mRNA for thromboxane A2 receptor, complete cds.
ACCESSION   D38081
NID         g533325
KEYWORDS    thromboxane A2 receptor.
SOURCE      Homo sapiens placenta cDNA to mRNA, clone HPL.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Mammalia; Eutheria; Primates; Catarrhini; Hominidae;
            Homo.
REFERENCE   1  (bases 1 to 2932)
  AUTHORS   Hirata,M., Hayashi,Y., Ushikubi,F., Yokota,Y., Kageyama,R.,
            Nakanishi,S. and Narumiya,S.
  TITLE     Cloning and expression of cDNA for a human thromboxane A2 receptor
  JOURNAL   Nature 349 (6310), 617-620 (1991)
  MEDLINE   91156030
REFERENCE   2  (sites)
  AUTHORS   Nusing,R.M., Hirata,M., Kakizuka,A., Eki,T., Ozawa,K. and
            Narumiya,S.
  TITLE     Characterization and chromosomal mapping of the human thromboxane
            A2 receptor gene
  JOURNAL   J. Biol. Chem. 268 (33), 25253-25259 (1993)
  MEDLINE   94043399
REFERENCE   3  (bases 1 to 2932)
  AUTHORS   Hirata,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-AUG-1994) to the DDBJ/EMBL/GenBank databases.
            Masakazu Hirata, Kyoto University Faculty of Medicine, Department
            of Pharmacology; Yoshida, Sakyo-ku, Kyoto, Kyoto 606, Japan
            (Tel:81-75-753-4392, Fax:81-75-753-4693)
FEATURES             Location/Qualifiers
     source          1..2932
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /tissue_type="placenta"
     misc_feature    1..705
                     /note="This part of the cDNA clone may not belong to the
                     thromboxane A2 receptor gene. Please refer to Nuesing,
                     R.M. et al. (reference2)"
     CDS             992..2023
                     /codon_start=1
                     /evidence=experimental
                     /product="Human thromboxane A2 receptor"
                     /db_xref="PID:d1007852"
                     /db_xref="PID:g533326"
                     /translation="MWPNGSSLGPCFRPTNITLEERRLIASPWFAASFCVVGLASNLL
                     ALSVLAGARQGGSHTRSSFLTFLCGLVLTDFLGLLVTGTIVVSQHAALFEWHAVDPGC
                     RLCRFMGVVMIFFGLSPLLLGAAMASERYLGITRPFSRPAVASQRRAWATVGLVWAAA
                     LALGLLPLLGVGRYTVQYPGSWCFLTLGAESGDVAFGLLFSMLGGLSVGLSFLLNTVS
                     VATLCHVYHGQEAAQQRPRDSEVEMMAQLLGIMVVASVCWLPLLVFIAQTVLRNPPAM
                     SPAGQLSRTTEKELLIYLRVATWNQILDPWVYILFRRAVLRRLQPRLSTRPRSLSLQP
                     QLTQRSGLQ"
     repeat_unit     2221..2338
     repeat_unit     2515..2636
     polyA_signal    2908..2913
     polyA_site      2932
                     /evidence=experimental
BASE COUNT 521 a 940 c 777 g 694 t 
ORIGIN 
1 gtaatgcaga gataataaaa cttcttaggt ccataggtct tataataatt taataaccta 
61 aacatggtat acaaattcct ccaaacccaa taacataatt atagtttcaa aaagttcccc 
121 aaactttcaa gttagatttt attgctttga tgagtggctt taaatatgaa aagtcttgcc 
181 tgtgaagggc aatccttttc ccgtggactg ggatctatag aaatacagaa atgtgcccag 
241 gggttcatct ccctaataac catcattcac atttctcaac ctccctaata accagccacc 
301 atgtgagaag gatccacagt tactgtttat gactataatt aactagtacc tgggactggt 
361 cagtggagtt ggttgcaacc tgatgctaag gatgtcaaag ttgtctcggc ctctgttccc 
421 agccagtaag taattccctg gcctcgggcc atacccccta atcttggtca gctgattatg 
481 acaggcagac agcacagtaa ataacactat atattaagaa aacccaaagc atatgtatca 
541 atggtatata cccaacagca tcctaggaat ggagagtctg tagcaagggc ctccaatgtg 
601 aaggtcaaca cagtcactgt gatgcgtgta tttccatttt gcaaagcatg atctctggtg 
661 gtcattttta tcttcctaac ttattggaaa agtctcctgt tttgggggcc cgcccctggt 
721 cacagccaga ctgactcagt ttccctggga ggtcccgctc gagcccgtcc ttcccctccc 
781 tctgcccgcc cccagccctc gccccaccct cggcgcccgc acatctgcct gctcagctcc 
841 agacggcgcc cggacccccg ggcgcgggat ccagccaggt gggagccccg cagatgaggt 
901 ctctgaaggt gtgcctgaac cagtgccagc ctgccctgtc tgcagcatcg gcctgatggg 
961 gtggtgactg atccctcagg gctccggagc catgtggccc aacggcagtt ccctggggcc 
1021 ctgttcccgg cccacaaaca ttaccctgga ggagagacgg ctgatcgcct cgccctggtt 
1081 cgccgcctcc ttctgcgtgg tgggcctggc ctccaacctg ctggccctga gcgtgctggc 
1141 gggcgcgcgg caggggggtt cgcacacgcg ctcctccttc cccaccttcc tctgcggcct 
1201 cgtcctcacc gacttcctgg ggctgctggt gaccggtacc atcgtggtgt cccagcacgc 
1261 cgcgctcttc gagtggcacg ccgtggaccc tggctgccgt ctctgtcgct tcatgggcgt 
1321 cgtcatgatc ttcttcggcc tgtccccgct gctgctgggg gccgccatgg cctcagagcg 
1381 ctacctgggt atcacccggc ccttctcgcg cccggcggtc gcctcgcagc gccgcgcctg 
1441 ggccaccgtg gggctggtgt gggcggccgc gctggcgctg ggcctgctgc ccctgctggg 
1501 cgtgggtcgc tacaccgtgc aatacccggg gtcctggtgc ttcctgacgc tgggcgccga 
1561 gtccggggac gtggccttcg ggctgctctt ctccatgctg ggcggcctct cggtcgggct 
1621 gtccttcctg ctgaacacgg tcagcgtggc caccctgtgc cacgtctacc acgggcagga 
1681 ggcggcccag cagcgtcccc gggactccga ggtggagatg atggctcagc tcctggggat 
1741 catggtggtg gccagcgtgt gttggctgcc ccttctggtc ttcattgccc agacagtgct 
1801 gcgaaacccg cctgccatga gccccgccgg gcagctgtcc cgcaccacgg agaaggagct 
1861 gctcatctac ttgcgcgtgg ccacctggaa ccagatcctg gacccctggg tgtatatcct 
1921 gttccgccgc gccgtgctcc ggcgtctcca gcctcgcctc agcacccggc ccaggtcgct 
1981 gtccctccag ccccagctca cgcagcgctc cgggctgcag taggaagtgg acagagcgcc 
2041 cctcccgcgc ctttccgcgg agcccttggc ccctcggaca gcccatctgc ctgttctgag 
2101 gattcagggg ctgggggtgc tggatggaca gtgggcatca gcagcagggt tttgggttga 
2161 ccccaatcca acccggggac ccccaactcc tccctgatcc ttttaccaag cactctccct 
2221 tcctcggccc ctttttccca tccagagctc ccaccccttc tctgcgtccc tcccaacccc 
2281 aggaagggca tgcagacatt ggaagagggt cttgcattgc tatttttttt tttagacgga 
2341 gtcttgctct gtcccccagg ctggagtgca gtggcgcaat ctcagctcac tgcaacctcc 
2401 acctcccggg ttcaagcgat tctcctgcct cagcctcctg agtagctggg actataggcg 
2461 cgcgccacca cgcccggcta atttttgtat ttttagtaga gacggggttt caccgtgttg 
2521 gccaggctgg tcttgaactc ctgacctcag gtgattcacc agcctcagcc tcccaaagtg 
2581 ctgggatcac aggcatgaac caccacacct ggccattttt tttttttttt tagacggagt 
2641 ctcactctgt ggcccagcct ggagtacagt ggcacgatct cggctcactg caacctccgc 
2701 ctcccgggtt caagcgattc tcgtgcctca gcctcccgag cagctgggat tacaggcgta 
2761 agccactgcg cccggccttg catgctcttt gaccctgaat ttgacctact tgctggggta 
2821 cagttgcttc cttttgaacc tccaacaggg aaggctctgt ccagaaagga ctgaatgtga 
2881 aacgggggca cccccttttc ttgccaaaat atatctctgc ctttggtttt at 

LOCUS       HUMGP3A      3170 bp    mRNA            PRI       08-NOV-1994
DEFINITION  Human endothelial membrane glycoprotein IIIa (GPIIIa) mRNA,
            complete cds.
ACCESSION   J02703
NID         g183452
KEYWORDS    glycoprotein; glycoprotein IIIa.
SOURCE      Human umbilical vein endothelial cell, cDNA to mRNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3170)
  AUTHORS   Fitzgerald,L.A., Steiner,B., Rall,S.C. Jr., Lo,S.S. and
            Phillips,D.R.
  TITLE     Protein sequence of endothelial glycoprotein IIIa derived from a
            cDNA clone. Identity with platelet glycoprotein IIIa and
            similarity to 'integrin'
  JOURNAL   J. Biol. Chem. 262 (9), 3936-3939 (1987)
  MEDLINE   87165991
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by L.A.Fitzgerald, 10-FEB-1987.
            The endothelial membrane glycoprotein IIIa is probably identical
            to the platelet glycoprotein IIIa.
FEATURES             Location/Qualifiers
     source          1..3170
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="17q21.32"
     sig_peptide     21..98
                     /gene="ITGB3"
                     /note="glycoprotein IIIa signal peptide (putative);
                     putative"
     CDS             21..2387
                     /gene="ITGB3"
                     /note="glycoprotein IIIa precursor"
                     /codon_start=1
                     /db_xref="GDB:G00-120-013"
                     /db_xref="PID:g306786"
                     /translation="MRARPRPRPLWVTVLALGALAGVGVGGPNICTTRGVSSCQQCLA
                     VSPMCAWCSDEALPLGSPRCDLKENLLKDNCAPESIEFPVSEARVLEDRPLSDKGSGD
                     SSQVTQVSPQRIALRLRPDDSKNFSIQVRQVEDYPVDIYYLMDLSYSMKDDLWSIQNL
                     GTKLATQMRKLTSNLRIGFGAFVDKPVSPYMYISPPEALENPCYDMKTTCLPMFGYKH
                     VLTLTDQVTRFNEEVKKQSVSRNRDAPEGGFDAIMQATVCDEKIGWRNDASHLLVFTT
                     DAKTHIALDGRLAGIVQPNDGQCHVGSDNHYSASTTMDYPSLGLMTEKLSQKNINLIF
                     AVTENVVNLYQNYSELIPGTTVGVLSMDSSNVLQLIVDAYGKIRSKVELEVRDLPEEL
                     SLSFNATCLNNEVIPGLKSCMGLKIGDTVSFSIEAKVRGCPQEKEKSFTIKPVGFKDS
                     LIVQVTFDCDCACQAQAEPNSHRCNNGNGTFECGVCRCGPGWLGSQCECSEEDYRPSQ
                     QDECSPREGQPVCSQRGECLCGQCVCHSSDFGKITGKYCECDDFSCVRYKGEMCSGHG
                     QCSCGDCLCDSDWTGYYCNCTTRTDTCMSSNGLLCSGRGKCECGSCVCIQPGSYGDTC
                     EKCPTCPDACTFKKECVECKKFDREPYMTENTCNRYCRDEIESVKELKDTGKDAVNCT
                     YKNEDDCVVRFQYYEDSSGKSILYVVEEPECPKGPDILVVLLSVMGAILLIGLAALLI
                     WKLLITIHDRKEFAKFEEERARAKWDTANNPLYKEATSTFTNITYRGT"
     gene            21..2387
                     /gene="ITGB3"
     mat_peptide     99..2384
                     /gene="ITGB3"
                     /note="glycoprotein IIIa"
BASE COUNT 705 a 809 c 909 g 747 t 

ORIGIN 132 bp upstream of SacI site. 

1 cgccgcggga ggcggacgag atgcgagcgc ggccgcggcc ccggccgctc tgggtgactg 

61 tgctggcgct gggggcgctg gcgggcgttg gcgtaggagg gcccaacatc tgtaccacgc 

121 gaggtgtgag ctcctgccag cagtgcctgg ctgtgagccc catgtgtgcc tggtgctctg 

181 atgaggccct gcctctgggc tcacctcgct gtgacctgaa ggagaatctg ctgaaggata 

241 actgtgcccc agaatccatc gagttcccag tgagtgaggc ccgagtacta gaggacaggc 

301 ccctcagcga caagggctct ggagacagct cccaggtcac tcaagtcagt ccccagagga 

361 ttgcactccg gctccggcca gatgattcga agaatttctc catccaagtg cggcaggtgg 

421 aggattaccc tgtggacatc tactacttga tggacctgtc ttactccatg aaggatgatc 

481 tgtggagcat ccagaacctg ggtaccaagc tggccaccca gatgcgaaag ctcaccagta 

541 acctgcggat tggcttcggg gcatttgtgg acaagcctgt gtcaccatac atgtatatct 

601 ccccaccaga ggccctcgaa aacccctgct atgatatgaa gaccacctgc ttgcccatgt 

661 ttggctacaa acacgtgctg acgctaactg accaggtgac ccgcttcaat gaggaagtga 

721 agaagcagag tgtgtcacgg aaccgagatg ccccagaggg tggctttgat gccatcatgc 

781 aggctacagt ctgtgatgaa aagattggct ggaggaatga tgcatcccac ttgctggtgt 

841 ttaccactga tgccaagact catatagcat tggacggaag gctggcaggc attgtccagc 

901 ctaatgacgg gcagtgtcat gttggtagtg acaatcatta ctctgcctcc actaccatgg 

961 attatccctc tttggggctg atgactgaga agctatccca gaaaaacatc aatttgatct 

1021 ttgcagtgac tgaaaatgta gtcaatctct atcagaacta tagtgagctc atcccaggga 

1081 ccacagttgg ggttctgtcc atggattcca gcaatgtcct ccagctcatt gttgatgctt 

1141 atgggaaaat ccgttctaaa gtcgagctgg aagtgcgtga cctccctgaa gagttgtctc 

1201 tatccttcaa tgccacctgc ctcaacaatg aggtcatccc tggcctcaag tcttgtatgg 

1261 gactcaagat tggagacacg gtgagcttca gcattgaggc caaggtgcga ggctgtcccc 

1321 aggagaagga gaagtccttt accataaagc ccgtgggctt caaggacagc ctgatcgtcc 

1381 aggtcacctt tgattgtgac tgtgcctgcc aggcccaagc tgaacctaat agccatcgct 

1441 gcaacaatgg caatgggacc tttgagtgtg gggtatgccg ttgtgggcct ggctggctgg 

1501 gatcccagtg tgagtgctca gaggaggact atcgcccttc ccagcaggac gagtgcagcc 

1561 cccgagaggg tcagcccgtc tgcagccagc ggggcgagtg cctctgtggt caatgtgtct 

1621 gccacagcag tgactttggc aagatcacgg gcaagtactg cgagtgtgac gacttctcct 

1681 gtgtccgcta caagggggag atgtgctcag gccatggcca gtgcagctgt ggggactgcc 

1741 tgtgtgactc cgactggacc ggctactact gcaactgtac cacgcgtact gacacctgca 

1801 tgtccagcaa tgggctgctg tgcagcggcc gcggcaagtg tgaatgtggc agctgtgtct 

1861 gtatccagcc gggctcctat ggggacacct gtgagaagtg ccccacctgc ccagatgcct 

1921 gcacctttaa gaaagaatgt gtggagtgta agaagtttga ccgggagccc tacatgaccg 

1981 aaaatacctg caaccgttac tgccgtgacg agattgagtc agtgaaagag cttaaggaca 

2041 ctggcaagga tgcagtgaat tgtacctata agaatgagga tgactgtgtc gtcagattcc 

2101 agtactatga agattctagt ggaaagtcca tcctgtatgt ggtagaagag ccagagtgtc 

2161 ccaagggccc tgacatcctg gtggtcctgc tctcagtgat gggggccatt ctgctcattg 

2221 gccttgccgc cctgctcatc tggaaactcc tcatcaccat ccacgaccga aaagaattcg 

2281 ctaaatttga ggaagaacgc gccagagcaa aatgggacac agccaacaac ccactgtata 

2341 aagaggccac gtctaccttc accaatatca cgtaccgggg cacttaatga taagcagtca 

2401 tcctcagatc attatcagcc tgtgccagga ttgcaggagt ccctgccatc atgtttacag 

2461 aggacagtat ttgtggggag ggatttcggg gctcagagtg gggtaggttg ggagaatgtc 

2521 agtatgtgga agtgtgggtc tgtgtgtgtg tatgtggggg tctgtgtgtt tatgtgtgtg 

2581 tgttgtgtgt gggagtgtgt aatttaaaat tgtgatgtgt cctgataagc tgagctcctt 

2641 agcctttgtc ccagaatgcc tcctgcaggg attcttcctg cttagcttga gggtgactat 

2701 ggagctgagc aggtgttctt cattacctca gtgagaagcc agctttcctc atcaggccat 

2761 tgtccctgaa gagaagggca gggctgaggc ctctcattcc agaggaaggg acaccaagcc 

2821 ttggctctac cctgagttca taaatttatg gttctcaggc ctgactctca gcagctatgg 

2881 taggaactgc tggcttggca gcccgggtca tctgtacctc tgcctccttt cccctccctc 

2941 aggccgaagg aggagtcagg gagagctgaa ctattagagc tgcctgtgcc ttttgccatc 

3001 ccctcaaccc agctatggtt ctctcgcaag ggaagtcctt gcaagctaat tctttgacct 

3061 gttgggagtg aggatgtctg ggccactcag gggtcattca tggcctgggg gatgtaccag 

3121 catctcccag ttcataatca caacccttca gatttgcctt attggcagcg 

LOCUS       HUMPLG2B     3303 bp    mRNA            PRI       07-JAN-1995
DEFINITION  Human platelet membrane glycoprotein IIb (ITGA2B) mRNA, complete
            cds.
ACCESSION   J02764
NID         g190067
KEYWORDS    membrane adhesive protein; platelet membrane glycoprotein platelet
            receptor.
SOURCE      Human HEL cell, cDNA to mRNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3303)
  AUTHORS   Poncz,M., Eisman,R., Heidenreich,R., Silver,S.M., Vilaire,G.,
            Surrey,S., Schwartz,E. and Bennett,J.S.
  TITLE     Structure of the platelet membrane glycoprotein IIb. Homology to
            the alpha subunits of the vitronectin and fibronectin membrane
            receptors
  JOURNAL   J. Biol. Chem. 262 (18), 8476-8482 (1987)
  MEDLINE   87250457
COMMENT     Draft entry and computer-readable sequence [1] kindly provided by
            M.Poncz, 15-APR-1987.
FEATURES             Location/Qualifiers
     source          1..3303
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="17q21.32"
     mRNA            <1..3303
                     /gene="ITGA2B"
                     /note="G00-120-012"
     gene            1..3303
                     /gene="ITGA2B"
     sig_peptide     2..94
                     /gene="ITGA2B"
                     /note="G00-120-012"
     CDS             2..3121
                     /gene="ITGA2B"
                     /codon_start=1
                     /db_xref="GDB:G00-120-012"
                     /product="platelet membrane glycoprotein IIb"
                     /db_xref="PID:g190068"
                     /translation="MARALCPLQALWLLEWVLLLLGPCAAPPAWALNLDPVQLTFYAG

PNGSQFGFSLDFHKDSHGRVAIVVGAPRTLGPSQEETGGVFLCPWRAEGGQCPSLLFD 

LRDETRNVGSQTLQTFKARQGI^ASWSWSDVIVACAPWQHWNVLEKTEEAEKTPVGS 

CFIAQPESGRRAEYSPCRGNTLSRIYVTNDFSWDKRYCEAGFSSWTQAGELVljGAPG 

GYYFLGLLAQAPVADIFSSYRPGILLWHVSSQSLSFDSSNPEYFDGYWGYSVAVGEFD 

GDLNTTEYWGAPTWSWTLGAVEILDSYYQRI^RLRAEQMASYFOT 

HDLLVGAPLYMESRADRKLAEVGRVYLFLQPRGPHALGAPSLLLTGTQLYGRFGSAIA 

PLGDLDRDGYNDIAVAAPYGGPSGRGQVLVFLGQSEGLRSRPSQVLDSPFPTGSAFGF 

SLRGATOIDDNGYPDLIVGAYGANQVAvTRAQPWKASVQIJ^VQDSLNPAVKSCVLPQ 

TKTPVSCFNI QMCVGATGHNI PQKLSLNAEL»QLDRQKPRQGRRVLLLGSQQAGTTLNL 

DLGGKHSPICHTTMAFLJ^EADFRDKLSPIVXSLNV^ 
EQTRIVLDSGEDDVCVPQLQLTASVTGSPLLVGADNVLELQMDAANEGEGAYEAELAV 
HL PQGAHYMRAL SNVEG F ERL I CNQKKENETR WLC E LGNPMKKNAQ I G I AMLVSVGN 

LEEAGESVSFQLQIRSKNSQNPNSKIVLLDVPVRAEAQVELRGNSFPASLWAAEEGE 

REQNSLDSWGPKVEHTYELHIWGPGTVNGIiHLSIHLPGQSQPSDLLYILDIQPQGGLQ 

CFPQPPVNPLKVDWGLPIPSPSPIHPAHHKRDRRQIFLPEPEQPSRLQDPVLVSCDSA 

PCTWQCDLQEMARGQRAMVTVIAFLWLPSLYQRPLDQF^SHAWFNVSSLPYAVPP 

LSLPRGEAQVOTQLLRALEERAIPIWWVLVGVLGGLLLLTILVI^MWKVGFFKRNRPP 
LEEDDEEGE" 
     mat_peptide     95..3118
                     /gene="ITGA2B"
                     /note="G00-120-012"
                     /product="platelet membrane glycoprotein IIb"
BASE COUNT 618 a 997 c 1026 g 662 t 

ORIGIN Unreported. 

1 gatggccaga gctttgtgtc cactgcaagc cctctggctt ctggagtggg tgctgctgct 
61 cttgggacct tgtgctgccc ctccagcctg ggccttgaac ctggacccag tgcagctcac 
121 cttctatgca ggccccaatg gcagccagtt tggattttca ctggacttcc acaaggacag 
181 ccatgggaga gtggccatcg tggtgggcgc cccgcggacc ctgggcccca gccaggagga 
241 gacgggcggc gtgttcctgt gcccctggag ggccgagggc ggccagtgcc cctcgctgct 
301 ctttgacctc cgtgatgaga cccgaaatgt aggctcccaa actttacaaa ccttcaaggc 
361 ccgccaagga ctgggggcgt cggtcgtcag ctggagcgac gtcattgtgg cctgcgcccc 
421 ctggcagcac tggaacgtcc tagaaaagac tgaggaggct gagaagacgc ccgtaggtag 
481 ctgctttttg gctcagccag agagcggccg ccgcgccgag tactccccct gtcgcgggaa 
541 caccctgagc cgcatttacg tggaaaatga ttttagctgg gacaagcgtt actgtgaagc 
601 gggcttcagc tccgtggtca ctcaggccgg agagctggtg cttggggctc ctggcggcta 
661 ttatttctta ggtctcctgg cccaggctcc agttgcggat attttctcga gttaccgccc 
721 aggcatcctt ttgtggcacg tgtcctccca gagcctctcc tttgactcca gcaacccaga 
781 gtacttcgac ggctactggg ggtactcggt ggccgtgggc gagttcgacg gggatctcaa 
841 cactacagaa tatgtcgtcg gtgcccccac ttggagctgg accctgggag cggtggaaat 
901 tttggattcc tactaccaga ggctgcatcg gctgcgcgca gagcagatgg cgtcgtattt 
961 tgggcattca gtggctgtca ctgacgtcaa cggggatggg aggcatgatc tgctggtggg 
1021 cgctccactg tatatggaga gccgggcaga ccgaaaactg gccgaagtgg ggcgtgtgta 
1081 tttgttcctg cagccgcgag gcccccacgc gctgggtgcc cccagcctcc tgctgactgg
1141 cacacagctc tatgggcgat tcggctctgc catcgcaccc ctgggcgacc tcgaccggga
1201 tggctacaat gacattgcag tggctgcccc ctacgggggt cccagtggcc ggggccaagt 
1261 gctggtgttc ctgggtcaga gtgaggggct gaggtcacgt ccctcccagg tcctggacag 
1321 ccccttcccc acaggctctg cctttggctt ctcccttcga ggtgccgtag acatcgatga 
1381 caacggatac ccagacctga tcgtgggagc ttacggggcc aaccaggtgg ctgtgtacag 
1441 agctcagcca gtggtgaagg cctctgtcca gctactggtg caagattcac tgaatcctgc 
1501 tgtgaagagc tgtgtcctac ctcagaccaa gacacccgtg agctgcttca acatccagat 
1561 gtgtgttgga gccactgggc acaacattcc tcagaagcta tccctaaatg ccgagctgca 
1621 gctggaccgg cagaagcccc gccagggccg gcgggtgctg ctgctgggct ctcaacaggc 
1681 aggcaccacc ctgaacctgg atctgggcgg aaagcacagc cccatctgcc acaccaccat 
1741 ggccttcctt cgagatgagg cagacttccg ggacaagctg agccccattg tgctcagcct 
1801 caatgtgtcc ctaccgccca cggaggctgg aatggcccct gctgtcgtgc tgcatggaga 
1861 cacccatgtg caggagcaga cacgaatcgt cctggactct ggggaagatg acgtatgtgt 
1921 gccccagctt cagctcactg ccagcgtgac gggctccccg ctcctagttg gggcagataa 
1981 tgtcctggag ctgcagatgg acgcagccaa cgagggcgag ggggcctatg aagcagagct
2041 ggccgtgcac ctgccccagg gcgcccacta catgcgggcc ctaagcaatg tcgagggctt 
2101 tgagagactc atctgtaatc agaagaagga gaatgagacc agggtggtgc tgtgtgagct
2161 gggcaacccc atgaagaaga acgcccagat aggaatcgcg atgttggtga gcgtggggaa 
2221 tctggaagag gctggggagt ctgtgtcctt ccagctgcag atacggagca agaacagcca
2281 gaatccaaac agcaagattg tgctgctgga cgtgccggtc cgggcagagg cccaagtgga 
2341 ^ctgcgaggg aactcctttc cagcctccct ggtggtggca gcagaagaag gtgagaggga
2401 gcagaacagc ttggacagct ggggacccaa agtggagcac acctatgagc tccacaacaa 
2461 tggccctggg actgtgaatg gtcttcacct cagcatccac cttccgggac agtcccagcc 
2521 ctccgacctg ctctacatcc tggatataca gccccagggg ggccttcagt gcttcccaca 
2581 gcctcctgtc aaccctctca aggtggactg ggggctgccc atccccagcc cctcccccat 
2641 tcacccggcc catcacaagc gggatcgcag acagatcttc ctgccagagc ccgagcagcc 
2701 ctcgaggctt caggatccag ttctcgtaag ctgcgactcg gcgccctgta ctgtggtgca
2761 gtgcgacctg caggagatgg cgcgcgggca gcgggccatg gtcacggtgc tggccttcct
2821 gtggctgccc agcctctacc agaggcctct ggatcagttt gtgctgcagt cgcacgcatg
2881 gttcaacgtg tcctccctcc cctatgcggt gcccccgctc agcctgcccc gaggggaagc
2941 tcaggtgtgg acacagctgc tccgggcctt ggaggagagg gccattccaa tctggtgggt
3001 gctggtgggt gtgctgggtg gcctgctgct gctcaccatc ctggtcctgg ccatgtggaa
3061 ggtcggcttc ttcaagcgga accggccacc cctggaagaa gatgatgaag agggggagtg
3121 atggtgcagc ctacactatt ctagcaggag ggttgggcgt gctacctgca ccgccccttc
3181 tccaacaagt tgcctccaag ctttgggttg gagctgttcc attgggtcct cttggtgtcg
3241 tttccctccc aacagagctg ggctaccccc cctcctgctg cctaataaag agactgagcc
3301 ctg
LOCUS HUMTFPB 13865 bp DNA PRI 14-JAN-1995
DEFINITION Human tissue factor gene, complete cds.
ACCESSION J02846
NID g339505
KEYWORDS Alu repeat; cell surface integral membrane protein; cell surface
receptor; tissue factor.
SOURCE Human DNA, clones lambda-TF[559,679,753,885,1377].
ORGANISM Homo sapiens
Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 13865)
AUTHORS Mackman,N., Morrissey,J.H., Fowler,B. and Edgington,T.S.
TITLE Complete sequence of the human tissue factor gene, a highly
regulated cellular receptor that initiates the coagulation
protease cascade
JOURNAL Biochemistry 28 (4), 1755-1762 (1989)
MEDLINE 89247359
COMMENT Draft entry and computer-readable sequence for [1] kindly provided
by J.H.Morrissey, 25-OCT-1988.
FEATURES Location/Qualifiers
source 1..13865
/organism="Homo sapiens"
/db_xref="taxon:9606"
/map="1p22-p21"
prim_transcript 799..13232
/note="TF mRNA and introns"
CDS join(922..1021,2190..2301,6392..6591,9289..9467,
10075..10234,11955..12091)
/gene="F3"
/note="tissue factor"
/codon_start=1
/db_xref="GDB:G00-119-895"
/db_xref="PID:g339506"
/translation="METPAWPRVPRPETAVARTLLLGWVFAQVAGASGTTNTVAAYNL
TWKSTNFKTILEWEPKPVNQVYTVQISTKSGDWKSKCFYTTDTECDLTDEIVKDVKQT
YLARVFSYPAGNVESTGSAGEPLYENSPEFTPYLETNLGQPTIQSFEQVGTKVNVTVE
DERTLVRRNNTFLSLRDVFGKDLIYTLYYWKSSSSGKKTAKTNTNEFLIDVDKGENYC
FSVQAVIPSRTVNRKSTDSPVECMGQEKGEFREIFYIIGAVVFVVIILVIILAISLHK
CRKAGVGQSWKENSPLNVS"
exon <922..1021
/gene="F3"
/note="tissue factor"
/number=1
gene join(922..1021,2190..2301,6392..6591,9289..9467,
10075..10234,11955..12091)
/gene="F3"
intron 1022..2189
/note="TF intron A"
exon 2190..2301
/gene="F3"
/number=2
intron 2302..6391
/note="TF intron B"
repeat_region 6127..6241
/note="Alu repeat partial copy A"
exon 6392..6591
/gene="F3"
/number=3
intron 6592..9288
/note="TF intron C"
repeat_region 8391..8677
/note="Alu repeat copy B"
exon 9289..9467
/gene="F3"
/number=4
intron 9468..10074
/note="TF intron D"
exon 10075..10234
/gene="F3"
/number=5
intron 10235..11954
/note="TF intron E"
repeat_region 10954..11249
/note="Alu repeat copy C"
exon 11955..>12091
/gene="F3"
/note="tissue factor"
/number=6
repeat_region 12458..12757
/note="Alu repeat copy D"
BASE COUNT 3711 a 2955 c 3240 g 3959 t 

ORIGIN 1 bp upstream of EcoRI site; chromosome 1. 

1 gaattctccc agaggcaaac tgccagatgt gaggctgctc ttcctcagtc actatctctg 
61 gtcgtaccgg gcgatgcctg agccaactga ccctcagacc tgtgagccga gccggtcaca 
121 ccgtggctga caccggcatt cccaccgcct ttctcctgtg cgacccgcta agggccccgc 
181 gaggtgggca ggccaagtat tcttgacctt cgtggggtag aagaagccac cgtggctggg 
241 agagggccct gctcacagcc acacgtttac ttcgctgcag gtcccgagct tctgccccag 
301 gtgggcaaag catccgggaa atgccctccg ctgcccgagg ggagcccaga gcccgtgctt
361 tctattaaat gttgtaaatg ccgcctctcc cactttatca ccaaatggaa gggaagaatt
421 cttccaaggc gccctccctt tcctgccata gacctgcaac ccacctaagc tgcacgtcgg
481 agtcgcgggc ctgggtgaat ccgggggcct tgggggaccc gggcaactag acccgcctgc 
541 gtcctccagg gcagctccgc gctcggtggc gcggttgaat cactggggtg agtcatccct 
601 tgcagggtcc cggagtttcc taccgggagg aggcggggca ggggtgtgga ctcgccgggg 
661 gccgcccacc gcgacggcaa gtgacccggg ccgggggcgg ggagtcggga ggagcggcgg 
721 gggcgggcgc cgggggcggg cagaggcgcg ggagagcgcg ccgccggccc tttatagcgc 
781 gcggggcacc ggctccccaa gactgcgagc tccccgcacc ccctcgcact ccctctggcc 
841 ggcccagggc gccttcagcc caacctcccc agccccacgg gcgccacgga acccgctcga
901 tctcgccgcc aactggtaga catggagacc cctgcctggc cccgggtccc gcgccccgag
961 accgccgtcg ctcggacgct cctgctcggc tgggtcttcg cccaggtggc cggcgcttca
1021 ggtgagtggc accagcccct ggaagcccgg ggcgcgccac acgcaggagg gaggcgacag 
1081 tcctggctgg cagcgggctc gccctggttc cccggggcgc ccatgttgtc ccccgcgcct 
1141 ac 9Sgactcg gctgcgctca cccagcccgg cttgaatgaa ccgagtccgt cgggcgccgg
1201 cgggagttgc agggagggag ttggcgcccc agaccccgct gccccttccg ctggagagtt 
1261 ttgctcgggg tgtccgagta attggactgt tgttgcataa gcggactttt agctcccgct 
1321 ttaactct 9ST ggaaagggct tcccagtgag ttgcgacctt caatatgata ggacttgtgc
1381 ct ^ c 9tctgc acgtgttggc gtgcagaggt ttggatatta tctttcatta tatgtgcatc
1441 ttcccttaat aaagagcgtc cctggtcttt tcctggccat ctttgttcta ggtttgggta 
1501 gaggcaatcc aaaagggctg gattgctgct tagattggag caggtacaac gttgtgcatg 
1561 ccccgtattt ctacgaggtg ttcgggacgg cgtagagact gggacctgct gcgtactggc 
1621 aaagcagacc ttcataagaa ataatcctga tccaatacag ccgacggtgt gacaggccac 
1681 acgtccccgt gggtctctgt ggaagtttca gtgtagcgac atttcagata aaagtggaaa 
1741 aagtgaagtt tggctttttt catttgtatg cagtcctaac tcttgtcaca cgtgtgggat 
1801 ttatcttttt ccataactta ctgaaaaccc ttcctggcgg gctgaacctg actcttcctg 
1861 agctgagtcc tggactggca cactgatggc tctgggctct tcccggtcaa gttataacaa 
1921 ggctttgccc atgaataatt tcaaacgaaa atgtcaagat ccttgccggt gtcctgggat 
1981 tacaaggtga atcttgtcat gaagaaattc taggtctaga aaaaatttga agattctttt 
2041 tctcttgata attcactaat gaagcttttg tggttgaaaa ataaaaagtg aggtttatgg 
2101 tgatgtcagg tgggaaggtg ttttatacat caatacattc gagtgctctg aagtgcatgt 
2161 aataatagct gtttctctgt tgtttaaagg cactacaaat actgtggcag catataattt 
2221 aacttggaaa tcaactaatt tcaagacaat tttggagtgg gaacccaaac ccgtcaatca 
2281 agtctacact gttcaaataa ggtaagctgg gtacagaaaa agaaaattaa ggtctttgat 
2341 gtttctactg tcctatgctg aacaagaatg tctttaaagc tgattactgg atgaaattat 
2401 ttaacagatg acgaagaaga agggattctt ggcaattcgc tggccggtgt catactctat 
2461 taggcctgca acatttccag accttaaact gatagaacat tttaattgtt ttaattgttt
2521 ttggaaatga tgggagagtt cctaagtgga gtataaactg tggagagatg aaccatcttg
2581 agtaggcact gaagtgtgct ttgggtcatg atagattaat taatctcatc taaacattga
2641 tgtctttttc cgttgctgtc tagactgtga acaatgtcta acaccttagg gaagaggtgg
2701 ggaggaatcc caatgtatac attgccctta agcagtgttt gattcattca tctttggact
2761 ccatgaatcg aaatctggta gaatacatga tcttagtgga ggaggccaaa tgcgtgactc
2821 actgagcctg gcagagcaga aatactctgc tgtctgcacc ctctgggtct ggtgtggctc
2881 tgcttcttgg tgcttcaact ctgactggca gctgtcccca ggaggcgata attcagcatg
2941 ttcaatctaa aggttatgac ttccttgatg gttttcacca tattcttggc aagtttttgg
3001 tttttgaaat gttctaggag gcttggtaga gatcttatga aatagagaat agctgctgtg
3061 gaaattattt taatgctaat tacataaaag tacaaaagta gcactagcta aaacaaaagg
3121 tattttgctg ttctgttttg ttttagcttg tgccaggcct tttacagcat taggaatgca
3181 acttctagat aacgatgcat cttttaagtg aatgttcttg tttttcaaaa tgaacttcat
3241 gacagtagtt gccaaaccag caaggagaac ttgcatgcat acgtgcatgc atgtgtggat
3301 atgtatgggg gtggggggag agaaagatga aggaatttca taacatgaaa taatgattac
3361 agttctggtc aaacttgtca attcagattt caccaattga gaattagtaa gtaatttctc
3421 tgatacaggc ctgaagttta ccttagtaaa cactttactt ccatatggta aaaattagat
3481 tttgggagga atgcttacct cctaaatata ttcaatctaa tatttgagga cacatgggaa
3541 tatatttatg attcatctgc tttttaaaca taagcctttg ttaactgtaa gttcttgaac
3601 tttataaggc tgctgttatt taaatgagca cagctcctga tctgcaaaca gcagagcgca
3661 gggctacagc ttgggggatg ccagccgact cagggtggtc ctgtggactg aacaatctct
3721 tgctgctgta ctggagggcc tgggagcttt tccatcagcc tcggcctgag gtgtgcactc
3781 ttctcctgcc caccccagga ataaatgaga ttcctggtta aaaaggacca gagcagtcat
3841 tttacagttg aggaaactgt tgctctgaga agtgagggat ttattcatga ctacactgat
3901 ggtgagtgcc catgtcaggt ctggaaccaa agtctaccca gtatccacac accaccatcc
3961 ctcaggtggc tctgccacag tctgatggga ggctccaaag cgggaggaag aaggaaagtc
4021 ttgcccactg catctcctca gttggccttc ctctctgcct gttttccctc cctacagtta
4081 gcatcttaag cagctgcctc tcttccctcc cgactgctct cactactgca gcctggctcc
4141 agccgcagga cactactgct gtgcagaagc ccctacttgg aactccaact gcatttttca
4201 cctttgctaa cagttttcag tggtggttgg gaaatgttat tggcttaagc cttagcacaa
4261 accgtcaccg gtgatattca ttccatggaa atgttctgaa ttctaaagct gaatttacaa
4321 agcttctgga aaacaacctg caaccaaatt agtgactgaa ttttttagtt aactcaaaat
4381 tccaaatcag agggttttgc aatgcctgga ggaaccttgg aggcttttaa agtgttaatg
4441 ctattaatgg cattcagagg gattttctac agaattgtcc cttcattacc tgtttataca
4501 gttttactac ttaccagggt actgtataaa tccttgtgct aaattttgct atagagtatg
4561 tggtccctgc tgtgagctgg gaggaaccaa atactgtatc tctatgttac atagaaagcc
4621 ctaggagact ttctcctgtt atctgaacaa ctatttgctg tactgataaa aaggaaacag
4681 catagtctca ttcacttttt gaaatggaaa tgataaaata aaacacattt tggtcattcg
4741 ggaacaaaat accctctcta cttttatcac ataaaattaa ataaatagaa accaaaatat
4801 ttcagtatca atcttagttt gtgcacttta ggataaagaa tgtgtttacc caaatccttt
4861 tggcctggtt acttagttca gattttgaaa gaaaatatat ttgtggcttt tatgtgtgaa
4921 tttagacaat ggaatccatg tggtgcctcg ttttccctga gattatgtat taattcaacc
4981 tgtaaatgca aaccatctaa tagtcagcga gaccctatag ccctgctgct taatgggggc
5041 acacaagggc atgcagccct cgtaccaggc agactgtgtt catattaaca gcatcgtgga
5101 gaaactcatg ctgggggaca ggggagggag atgtaaatgc tcagcaggga gatctggaga
5161 ttcctggagc aggtggagtt gggacctggc cttgaacgat gggtctggct ctggcagtca
5221 gtaatgccaa agggaagagc agcataactg tcactttcca tgggacagaa gtgtgtgaat
5281 caagttgcag tgacgcttca cctatttatt attttggtca tttagaagaa tttcattgtc
5341 agtagaagtc ctttaaatca tttccccttc agtgacgtct cacaaaaaaa agatctgtct
5401 ttagcttttt agtctcagac tttattagac agatactacc tgtactctta ttctgtaatc
5461 tttgttggga tggattcaca tcttgcaaag gaagggaggc atgtagtata atggggcaaa
5521 cagacccagc tctgccactc gttagatatg tgaccttctg caagttgctt agtgcctgtg
5581 agcttcagtg tcctcatgga taagaaagat ccaacacctt cttggaagga ttatatcaaa
5641 tgaagtaaca tgagtaaagg gtccagcaga atacctggca tatagtggag tcaatgaatg
5701 attaataata ttattaatag tggtcatgag agatatatgt ataacatgtt attatgtaga
5761 ctcactatat agactctatt ctacatagaa tatagaacat tatataacaa acaactataa
5821 taagtagact atagtaaaca acctcacttt gtctcagttg cctcatcttg atggaaaact
5881 gctctttctc tcctgttacc ctgacagaga gcgtctacat tctaaaagaa agatatttaa
5941 caaaatggtt gagtacagat ccaagagtca aatagctgtc tggttcaaag tccagctgtg
6001 tgattttgag ctagtcaccc aatctcactt tgtctcagta gccttatttg taaaaacaag
6061 gcaaattaca gagccatccc ctgggttgct atgaggactc aaacatgcat cccaagtgct
6121 cggtgttgct aggtatgatg gctcacacct gtacattcag cactttggga ggccgaagca
6181 gaaggatcag cctgggcaac atagcaggac cccatctcta caaaacaatg tttaaaaaaa
6241 agcaaagtgc tcagcacagt gactgcatca ttaggattga ttgtagggct cctgatgtta
6301 gcacagaaca ccacagccag gaagcagtct atcttgttgg gtgcaaattg taacattcca
6361 tttatgtttc ttccttcttt tctttcttta gcactaagtc aggagattgg aaaagcaaat
6421 gcttttacac aacagacaca gagtgtgacc tcaccgacga gattgtgaag gatgtgaagc
6481 agacgtactt ggcacgggtc ttctcctacc cggcagggaa tgtggagagc accggttctg
6541 ctggggagcc tctgtatgag aactccccag agttcacacc ttacctggag agtaagtggc
6601 ttgggctgta ataccgttca ttcttgttag aaacgtctga acattctcgt gatcttgtgc
6661 ctttaggggc tacaaaatta aaaatattta ttcttttttt ctcagaaact ggtatgtatc
6721 acagccctct tcacacattc cagatgtggt aggaggttca cagaatgtga acttttggag
6781 ctgatgacag tgtcatcaag taactttctc ccccagtctg tccccagacc ctgttactgt
6841 cctcagtaag cggctgaatg tgtgttggga gagggcgggc cagggaagcg ggtagggata
6901 ggaaatccac caaggccggg gttttagctt ttccctatat atatatcatg tatcctgatt
6961 tttctgtccc gttatcacac taaaaatccc agttgaggat ttttcccaaa cggtcataaa
7021 tcaatgagga aagtccatgg tttccctctg agcccataat tagcctaatt atgctgacct
7081 tttctaatca gttggccatg atttgagttc cgtgatgtgc cagcacctgc ccagccatct
7141 gcctgtcacc ctcgttctgg ttttggaaag gtggaatact ttcctcctca gcctttgccc
7201 ctgtaagctg gccctaggag ccagtaaaag aatgaagaga attcctgtca agtaggagat
7261 ttattctttt gccgcaactg tggctctgag ctaggcaatt tagataaatg catgtagcac
7321 attgagtaga gtgaaattag cttctcttgt aaggccagct ggttagaatg aaggtgttgt
7381 gtgagtgtta ggcccagcga gagagaacag tttctcaagg taggaatggt gaaaagaagg
7441 ggtggacgga caaccaacca accatcctcc tctggtatct actttgaggg ttgaaatagg
7501 gggcctgacc ccaggtgaat gtggctgcct tcccagagcc cccatttgca agaccctcca
7561 gacccccagg tgcttctgct tgtgtctttt gtggcaccag gcaagaatgc agcagcgtca
7621 gcagcccctc tggtgactgt ggcatggttg acattcattt cccccctaat taatggcatc
7681 ctcatgattc tcttttatat taatagttct tgagtttttt tgtaagctac ttcaaatcct
7741 ttgttggtgc aagatagaag atattttatg tgtttgtttt gcatgtgcac acacatattt
7801 ggcctgtgaa ttgatgtttg ttttcctgtc atttaaccaa agcacatgag ataattgagc
7861 cattgcagag accccgtggt taaatccggc ttctcgaggt accaaggaca tttcctgggc
7921 tttctcacag ccctacatat ttttgaacct aaaatatcgt agtttatgct accaccctgt
7981 tcagtatagt agccactagc cacatgtggc tgttgaccac ttgaaatatg gctaatgctc
8041 taagtataaa gtacacactg gaatttaaga agtgtagaat atctcaaaac ttttttatat
8101 tgattacaca ttaaaatgat tatattccag atatatgcag ttgactcaag caatgcatgg
8161 ctgagaggca ccgactccct gtgcagttga aaatccgagt ataacttgac tccccaaaaa
8221 cttaactact aatagcctac ctatcggttg actgttgact gcagccttac caataagata
8281 aacagtcaat taacacacat ttttcatgtt gcgtgtatta tatactgtat tcttacaata
8341 aagtaagcta gaggaaagaa aatgttatta agaaaattat aaggaaaaga ggctgggcat
8401 ggtggctcgt gcctgtaatc tcagaacttt gggatgctaa ggcgggtgga tcacttgagg
8461 tcaggagttc aagaccagcc tggccaacat ggtgaaaccc catctctacc aaaaatacaa
8521 aaattagcca ggcgtggttg tgggtgcctg taatcccagc tacttgggag gctgaggcag
8581 gagaatcact tcgacccagg tggaggaggt tgcagtgaac tgagattgcg ccactgcact
8641 ccggcctggg tgacagagcg agactctgtc taaaaaagaa agggaaagaa agaaaaaaaa
8701 gaaaagaaaa gaaaagaaag aaggaaggaa gagaaagaat tataaggaag agaaaatata
8761 tttactattg ataaagtgga agtggatcat cataaaggtg ttcatcctcg tcatcttcat
8821 gttgagtagg ctgaggagga ggaggaggag gaagagcagg ggccacggca ggagaaaaga
8881 tggaggaagt aggaggcggc acacttggtg taacttttat ttaaaaaaat ttgcatacaa
8941 gtggatccac agagttcaaa cccatgttgt tcaggggtca actgtctttg gttaaataaa
9001 atatattatt aaaattaatt tcacctgttc ctttttactt tttctaatgt gactactaga
9061 aaacttaaaa tgacatctga ggctccattg tcttcccctt gggccagcac taccacagaa
9121 tgtcttagga ttcagctcca ggccgccacg cctgcttctt tcagggagct ggttctatgc
9181 acatgtttta tatgagagat aattaagttg tcaattgtga taacaaaaca ggatttgact
9241 ttgtacagaa ttctttggtt ccaaccaagc tcatttcctt tgtttcagca aacctcggac
9301 agccaacaat tcagagtttt gaacaggtgg gaacaaaagt gaatgtgacc gtagaagatg
9361 aacggacttt agtcagaagg aacaacactt tcctaagcct ccgggatgtt tttggcaagg
9421 acttaattta tacactttat tattggaaat cttcaagttc aggaaaggtg agcatttttt
9481 aatttgtttt tatgacctgt tttaaattgt gaatacttgg ttttacaacc catttcttcc
9541 ccaattcaaa aatagcagaa cagagttgtt gagaaggtga tggagtagaa gggggagcgc
9601 gcactgtggg gaggggtgga caacaggcct ggtcctacct gtgactctgc actaccctgt
9661 gactctggca gggccccctc ggagacccag gttcctcagc caaccggctg gatcaggtca
9721 tctctaaagg tcccgccacg ctcacatttc tccctctatt gaggatccca ggcacaaaat
9781 ttgtttttgg ttcaatgcat aatactccct tcctttttct tttactgcag atatcttcta
9841 aaggggctca atagggttca atatgcctaa attggatctt ctcagtcttg gaaaaggcat
9901 ttttagcagt gatcaaggga aactgattag cgaagtcact tctaatcctt cacgtgtcag
9961 ctgtgttctt gtaggctttg cttagaacct aggtttttac ttccacagtg acttaataaa
10021 ggggaaagaa ttgactcaga gcccagatga attaagaact ctatcttttt acagaaaaca
10081 gccaaaacaa acactaatga gtttttgatt gatgtggata aaggagaaaa ctactgtttc
10141 agtgttcaag cagtgattcc ctcccgaaca gttaaccgga agagtacaga cagcccggta
10201 gagtgtatgg gccaggagaa aggggaattc agaggtgagt ggctctgcca gccatttgcc

FIG. 8D

10261 tgggggtatg ggtgctgtgg gtgacttctg gaggagtagc tccaccctca gggctgggat
10321 atacttcctt ggttaaatat tcaggaaaac aaactgcctg gaggtttttt gttgttattt
10381 gtttgttttg gttttgattt tgctttggta caaaaaagat tttggacatt tagaaatgtt
10441 tctgtgttga ttgtgccctt gtattagcag gtgttttctt gagcacctgt catgtgctaa
10501 gccctctgct gagcactgga tacacaaact gtgtttagga tttagcaaca agtcacagat
10561 ttccctgggc attttttcat gcttaaattc taattctggg ggtggcttct ggaccagctg
10621 caacaggaca cagtagacat tcgtgagtac ccactgtggg ctgttgccac agaggctgta
10681 gagtctaacc catcaaggga agggattgag tatatcaaat atacccacat gcatgcatgt
10741 gtgtatatgg cggacacgtg tgtgtacatg catgtgcata tgttgggagc tcaggcccat
10801 tgtgcgagga acagtcccta accggaagtg ctgtgggcct tcagactctt gcaggaagct
10861 gcaagcctgt gtgtctcgat ccatgcctta cagggaaagt attctgagta ctttcagtga
10921 agaaaagagt caggggatat aaacgatggc ttacgctggg tgtggtggct cacgcctgta
10981 gtccctgcac tttgggaggc ccagacaggc aaatcacttg aggtcaggag tttgggacca
11041 gcctggccaa catggtaaaa gcccatctct actcaaaata caaaaagtag ctgggtgtgg
11101 ttgcacgtgt ctgtagtccc agctactcag gaggttgagg caggagaatt gcttgaacct
11161 gggaggcgga ggctgaagtg agctgagatt ggaccactgt actccagcct gggtgacaga
11221 gcgagattcc atctcaaaaa aaaaaaaaag aaacaacgaa aaaagaaatg atggcttagc
11281 tccatgtgaa gatgatattt gaacatttta aaacacttta aataaactgt tctctcctgt
11341 ttattgccac tgacaggaga ggtttctctt tacctctggt cctgcacccc tctgagccat
11401 cctacccaca gccttcagtc attgtcctaa agcctagctc taattccact gcctctcctt
11461 ttgtgcacac acacttctct gcttccctgg ccgttctcta tcttggagag gcatttcaaa
11521 cgccacttcc accagaaggc cttgctactg caccaactag ttactatctc ttcttcaccc
11581 aaatcctggt agcactttgg atctcccact tgcacttagg gttcaccttc cgttataatc
11641 attgccatca atctcagcat cgttttaggc acttctttcc agccattgtt cttacctcca
11701 actacatatc ttttctggac tgtgcattat tcagtttatt aaatgcccat taaatgtgtt
11761 tagccattgt caattactct gaaacgttca ggttttgaca aattctttcc taatgtaagt
11821 gtggtggaaa gagtgaaaga aagtcaaatt gcacaaaaat aggatggtgt aatttggggt
11881 tatgccgtca attttgtcca ctgataaatg ggatttgagc tctccaagtt gactagatgc
11941 cctttatttt tcagaaatat tctacatcat tggagctgtg gtatttgtgg tcatcatcct
12001 tgtcatcatc ctggctatat ctctacacaa gtgtagaaag gcaggagtgg ggcagagctg
12061 gaaggagaac tccccactga atgtttcata aaggaagcac tgttggagct actgcaaatg
12121 ctatattgca ctgtgaccga gaacttttaa gaggatagaa tacatggaaa cgcaaatgag
12181 tatttcggag catgaagacc ctggagttca aaaaactctt gatatgacct gttattacca
12241 ttagcattct ggttttgaca tcagcattag tcactttgaa atgtaacgaa tggtactaca
12301 accaattcca agttttaatt tttaacacca tggcaccttt tgcacataac atgctttaga
12361 ttatatattc cgcactcaag gagtaaccag gtcgtccaag caaaaacaaa tgggaaaatg
12421 tcttaaaaaa tcctgggtgg acttttgaaa agcttttttt tttttttttt tttttgagac
12481 ggagtcttgc tctgttgccc aggctggagt gcagtagcac gatctcggct cactgcaccc
12541 tccgtctctc gggttcaagc aattgtctgc ctcagcctcc cgagtagctg ggattacagg
12601 tgcgcactac cacgccaagc taatttttgt attttttagt agagatgggg tttcaccatc
12661 ttggccaggc tggtcttgaa ttcctgacct caggtgatcc acccaccttg gcctcccaaa
12721 gtgctagtat tatgggcgtg aaccaccatg cccagccgaa aagcttttga ggggctgact
12781 tcaatccatg taggaaagta aaatggaagg aaattgggtg catttctagg acttttctaa
12841 catatgtcta taatatagtg tttaggttct tttttttttc aggaatacat ttggaaattc
12901 aaaacaattg gcaaactttg tattaatgtg ttaagtgcag gagacattgg tattctgggc
12961 accttcctaa tatgctttac aatctgcact ttaactgact taagtggcat taaacatttg
13021 agagctaact atatttttat aagactacta tacaaactac agagtttatg atttaaggta
13081 cttaaagctt ctatggttga cattgtatat ataatttttt aaaaaggttt tctatatggg
13141 gattttctat ttatgtaggt aatattgttc tatttgtata tattgagata atttatttaa
13201 tatactttaa ataaaggtga ctgggaattg ttactgttgt acttattcta tcttccattt
13261 attatttatg tacaatttgg tgtttgtatt agctctacta cagtaaatga ctgtaaaatt
13321 gtcagtggct tacaacaacg tatctttttc gcttataata cattttggtg actgtaggct
13381 gactgcactt cttctcaatg ttttctcatt ctaggatgca aaccaatgga gaagccccta
13441 attagatcag ggcagaggga aaaacaaaaa actggtagaa accggcaacc acagcttcaa
13501 gctttaagcc catctcctac acttctgctc tgtacgtgcc cattgtcact tctgttcaca
13561 tgctactgtc ccaagcaagt gaccaagcct gacaatactt tgtctactgg agtcactgca
13621 aggcacatga cggggcaggg atgtcgtctt acagggaaga gaaaagataa tgctctctac
13681 tgcagacttg gagagatttc ttcccattgg cagtagtttg actaattgga gatgagaaaa
13741 aaagaaacat tcttgggatg attgtattga aacaaaatta ggtaaaagga caatatagga
13801 tagggagaga tataagtgga atgagatctc tagagtccat taaaagcaag ctagattgag
13861 agctc
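
The CDS of the tissue factor (F3) record above is written as a join of six exon ranges. As a minimal sketch, not part of the original record, the following Python snippet checks that the spliced coding sequence is a whole number of codons. The helper name `parse_join` is hypothetical; the coordinates are copied from the feature table.

```python
import re

def parse_join(location):
    """Return 1-based inclusive (start, end) pairs from a GenBank location string."""
    # Partial-end markers (< and >) are skipped; only the coordinates are kept.
    return [(int(a), int(b)) for a, b in re.findall(r"<?(\d+)\.\.>?(\d+)", location)]

# CDS location copied from the HUMTFPB (J02846) feature table.
F3_CDS = "join(922..1021,2190..2301,6392..6591,9289..9467,10075..10234,11955..12091)"

exons = parse_join(F3_CDS)
exon_lengths = [end - start + 1 for start, end in exons]
cds_length = sum(exon_lengths)
codons, leftover = divmod(cds_length, 3)

print(exon_lengths)      # [100, 112, 200, 179, 160, 137]
print(cds_length)        # 888
print(codons, leftover)  # 296 0
```

The 888 spliced bases give 296 codons. The last of these is the stop codon taa at positions 12089..12091, so the CDS encodes 295 residues.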
LOCUS HUMCETP7 894 bp DNA PRI 01-NOV-1994
DEFINITION Human cholesteryl ester transfer protein (CETP) gene, exons 15 and
16.
ACCESSION M32998 J02898
NID g180267
KEYWORDS cholesteryl ester transfer protein.
SEGMENT 7 of 7
SOURCE Human DNA.
ORGANISM Homo sapiens
Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 894)
AUTHORS Agellon,L.B., Quinet,E.M., Gillette,T.G., Drayna,D.T., Brown,M.L.
and Tall,A.R.
JOURNAL Unpublished (1990)
REFERENCE 2 (sites)
AUTHORS Agellon,L.B., Quinet,E.M., Gillette,T.G., Drayna,D.T., Brown,M.L.
and Tall,A.R.
TITLE Organization of the human cholesteryl ester transfer protein gene
JOURNAL Biochemistry 29 (6), 1372-1376 (1990)
MEDLINE 90241928
COMMENT [2] sites for [1]; intron/exon boundaries.
Draft entry and computer-readable sequence for [2] kindly provided
by L.B.Agellon, 16-MAR-1990.

Location/Qualifiers 
source 1..894
/organism="Homo sapiens"
/db_xref="taxon:9606"
gene join(M32992:388..1656,M32993:1..3446,M32994:1..628,
M32995:1..399,M32996:1..409,M32997:1..1420,1..342)
/gene="CETP"
CDS join(M32992:388..505,M32992:1408..1522,M32993:432..566,
M32993:654..724,M32993:954..1041,M32993:2068..2137,
M32993:2355..2415,M32993:3023..3114,M32994:166..345,
M32995:238..288,M32996:128..292,M32997:375..442,
M32997:770..803,M32997:1285..1357,257..342,523..597)
/note="cholesteryl ester transferase protein precursor"
/codon_start=1
/db_xref="PID:g180269"
/translation="MLAATVLTLALLGNAHACSKGTSHEAGIVCRITKPALLVLNHET
AKVIQTAFQRASYPDITGEKAMMLLGQVKYGUiNIQISHLSIASSQVELVEAKSIDVS
IQNVSVVFKGTLKYGYTTAWWl^IDQSIDFEIDSAIDLQINTQLTCDSGRVRTDAPDC
YLSFHKLLLHLQGEREPGWIKQLFTNFISFTLKLVLKGQICKEINVISNIMADFVQTR
AASILSDGDIGVDISLTGDPVITASYLESHHKGHFIYKNVSEDLPLPTFSPTLLGDSR
MLYFWFSERWHSIiAKVAFQDGRI^lLSLMGDEFKAVLETWGFNTNQEIFQEWGGFPS
QAQVTVHCLKMPKISCQNKGVWNSSVMVKFLFPRPDQQHSVAYTFEEDIVTTVQASY
SKKKLFLSUJDFQITPKTVSNLTESSSESVQSFLQSMITAVGIPEVMSRLEVVFTALM
NSKGVSLFDIINPEIITRDGFLLLQMDFGFPEHLLVDFLQSLS"
prim_transcript <1..772
/note="CETP mRNA and introns"
intron <1..256
/gene="CETP"
/note="CETP intron N"
mat_peptide 257..342
/gene="CETP"
/note="cholesteryl ester transferase protein"
exon 257..342
/gene="CETP"
/note="G00-119-773"
/number=15
intron 343..522
/note="CETP intron O"
exon 523..>597
/note="cholesteryl ester transferase protein precursor"
/number=16
mat_peptide 523..594
/note="cholesteryl ester transferase protein"
polyA_signal 756..762
BASE COUNT 178 a 262 c 256 g 198 t 

ORIGIN About 950 bp after segment 6. 

1 ggatgggttg ggagctcaag ttttggggca gaagggaatt ttttttggca gcagagtgca 
61 agccctgccg ccaggcaaac tctgctcttc ctcatcctca gaagcacttc ctcactctgc 
121 taaatcaaag tgaaacgcat gtttacagaa tattggtcca aaagggtctc agcatctccc 
181 actacccagg gtgcagagcc tcgggccggc cttgctcccc aagaagggct gactggggct 
241 ctgtcccctc gcccagggct cgaggtagtg tttacagccc tcatgaacag caaaggcgtg 
301 agcctcttcg acatcatcaa ccctgagatt atcactcgag atgtgagtac aaagcccccc 
361 tcaccagccc ctgttcctgg ggagagaggc ccagacagga ttcctggggt gactgggggc 
421 tgttggggag acagacagag gggcctctac cagcttggct ccctcctggt ggcctgggag 
481 tcagcccagc tcgcccctct ctcctactgc ccctcccttc agggcttcct gctgctgcag 
541 atggactttg gcttccctga gcacctgctg gtggatttcc tccagagctt gagctagaag 
601 tctccaagga ggtcgggatg gggcttgtag cagaaggcaa gcaccaggct cacagctgga 
661 accctggtgt ctcctccagc gtggtggaag ttgggttagg agtacggaga tggagattgg 
721 ctcccaactc ctccctatcc taaaggccca ctggcattaa agtgctgtat cQaagagctg 
781 cggagtcctt cttctgtggc tggcgggtag aggggggggg aagggattgt ctcaccagtg 
841 ccgtccacct cttttcagcc cttccaagca gctgccccca aaccctccaa gctt 
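
Each record states its length twice: once on the LOCUS line and once, implicitly, as the sum of the BASE COUNT line. As a hedged illustration rather than anything prescribed by the record, the Python sketch below parses the BASE COUNT line of the CETP segment and compares its total with the 894 bp given on the LOCUS line. The function names `parse_base_count` and `count_bases` are hypothetical helpers.

```python
from collections import Counter

def parse_base_count(line):
    """Parse a 'BASE COUNT n a n c n g n t' line into a dict of base -> count."""
    tokens = line.split()[2:]  # drop the words BASE and COUNT
    return {base: int(n) for n, base in zip(tokens[0::2], tokens[1::2])}

def count_bases(origin_text):
    """Count a/c/g/t in ORIGIN lines, ignoring position numbers and whitespace."""
    return Counter(ch for ch in origin_text.lower() if ch in "acgt")

# BASE COUNT line copied from the HUMCETP7 (M32998) record.
declared = parse_base_count("BASE COUNT 178 a 262 c 256 g 198 t")
total = sum(declared.values())
print(total)  # 894
```

Running `count_bases` over the ORIGIN lines and comparing the result with `declared` is one way to locate a transcription error: any character that is not a, c, g or t is skipped, so a damaged base shows up as a shortfall in the total.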
LOCUS HUMCILA 1431 bp mRNA PRI 01-NOV-1994
DEFINITION Human lipoprotein-associated coagulation inhibitor mRNA, complete
cds.
ACCESSION J03225
NID g180545
KEYWORDS lipoprotein-associated coagulation inhibitor.
SOURCE Human placenta, cDNA to mRNA, clone lambda-P9.
ORGANISM Homo sapiens
Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1431)
AUTHORS Wun,T.C., Kretzmer,K.K., Girard,T.J., Miletich,J.P. and Broze,G.J.
Jr.
TITLE Cloning and characterization of a cDNA coding for the
lipoprotein-associated coagulation inhibitor shows that it
consists of three tandem Kunitz-type inhibitory domains
JOURNAL J. Biol. Chem. 263 (13), 6001-6004 (1988)
MEDLINE 88198127
COMMENT Draft entry and printed copy of sequence for [1] kindly provided
by T.-C.Wun, 19-MAR-1988.

Location/Qualifiers 
source 1..1431
/organism="Homo sapiens"
/db_xref="taxon:9606"
/map="2q31-q32.1"
sig_peptide 133..216
/gene="TFPI"
/note="lipoprotein-associated coagulation inhibitor signal peptide"
CDS 133..1047
/gene="TFPI"
/note="lipoprotein-associated coagulation inhibitor precursor"
/codon_start=1
/db_xref="GDB:G00-127-364"
/db_xref="PID:g180546"
/translation="MIYTMKKVHALWASVCLLLNLAPAPLNADSEEDEEHTIITDTEL
PPLKLMHSFCAFKADDGPCKAIMKRFFFNIFTRQCEEFIYGGCEGNQNRFESLEECKK
MCTRDNANRIIKTTLQQEKPDFCFLEEDPGICRGYITRYFYNNQTKQCERFKYGGCLG
NMNNFETLEECKNICEDGPNGFQVDNYGTQLNAVNNSLTPQSTKVPSLFEFHGPSWCL
TPADRGLCRANENRFYYNSVIGKCRPFKYSGCGGNENNFTSKQECLRACKKGFIQRIS
KGGLIKTKRKRKKQRVKIAYEEIFVKNM"



FIG. 10A 






WO 99/50454 



PCT/US99/06473 






gene 133..1047
/gene="TFPI"
mat_peptide 217..1044
/gene="TFPI"
/note="lipoprotein-associated coagulation inhibitor"

BASE COUNT 479 a 244 c 267 g 441 t 

ORIGIN 351 bp upstream of Sspl site. 

1 ggcgggtctg cttctaaaag aagaagtaga gaagataaat cctgtcttca atacctggaa 

61 ggaaaaacaa aataacctca actccgtttt gaaaaaaaca ttccaagaac tttcatcaga 

121 gattttactt agatgattta cacaatgaag aaagtacatg cactttgggc ttctgtatgc 

181 ctgctgctta atcttgcccc tgcccctctt aatgctgatt ctgaggaaga tgaagaacac 

241 acaattatca cagatacgga gttgccacca ctgaaactta tgcattcatt ttgtgcattc 

301 aaggcggatg atggcccatg taaagcaatc atgaaaagat ttttcttcaa tattttcact 

361 cgacagtgcg aagaatttat atatggggga tgtgaaggaa atcagaatcg atttgaaagt 

421 ctggaagagt gcaaaaaaat gtgtacaaga gataatgcaa acaggattat aaagacaaca 

481 ttgcaacaag aaaagccaga tttctgcttt ttggaagaag atcctggaat atgtcgaggt 

541 tatattacca ggtattttta taacaatcag acaaaacagt gtgaacgttt caagtatggt 

601 ggatgcctgg gcaatatgaa caattttgag acactggaag aatgcaagaa catttgtgaa 

661 gatggtccga atggtttcca ggtggataat tatggaaccc agctcaatgc tgtgaataac 

721 tccctgactc cgcaatcaac caaggttccc agcctttttg aatttcacgg tccctcatgg 

781 tgtctcactc cagcagacag aggattgtgt cgtgccaatg agaacagatt ctactacaat 

841 tcagtcattg ggaaatgccg cccatttaag tacagtggat gtgggggaaa tgaaaacaat 

901 tttacttcca aacaagaatg tctgagggca tgtaaaaaag gtttcatcca aagaatatca 

961 aaaggaggcc taattaaaac caaaagaaaa agaaagaagc agagagtgaa aatagcatat 

1021 gaagaaattt ttgttaaaaa tatgtgaatt tgttatagca atgtaacatt aattctacta 

1081 aatattttat atgaaatgtt tcactatgat tttctatttt tcttctaaaa tcgttttaat 

1141 taatatgttc attaaatttt ctatgcttat tgtacttgtt atcaacacgt ttgtatcaga

1201 gttgcttttc taatcttgtt aaattgctta ttctaggtct gtaatttatt aactggctac 

1261 tgggaaatta cttattttct ggatctatct gtattttcat ttaactacaa attatcatac 

1321 taccggctac atcaaatcag tcctttgatt ccatttggtg accatctgtt tgagaatatg 

1381 atcatgtaaa tgattatctc ctttatagcc tgtaaccaga ttaagccccc c 
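
The feature coordinates of the TFPI record above can be checked against one another. The sketch below, in Python, treats the 133..216 feature as the signal (leader) peptide. That reading is an assumption based on its position immediately before the mature peptide, and the names `span`, `leader` and `mature` are illustrative only.

```python
def span(start, end):
    """Length in bases of a 1-based inclusive range."""
    return end - start + 1

# Coordinates copied from the HUMCILA (J03225) feature table.
cds = span(133, 1047)     # full coding sequence, including the stop codon
leader = span(133, 216)   # assumed signal peptide
mature = span(217, 1044)  # mat_peptide
STOP = 3                  # one stop codon

consistent = (leader + mature + STOP == cds) and cds % 3 == 0
residues = (cds - STOP) // 3

print(cds, leader, mature)  # 915 84 828
print(consistent)           # True
print(residues)             # 304
```

The 304 residues split into 28 leader residues and 276 mature residues, which matches the length of the translation given in the record.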
LOCUS HUMPRC 1366 bp mRNA PRI 08-JAN-1995
DEFINITION Human protein C, mRNA.
ACCESSION K02059
NID g190322
KEYWORDS glycoprotein; protease; protein C; serine protease.
SOURCE Human liver, cDNA (library of Woo) to mRNA, clones lambda-HC1026
and lambda-HC1375.
ORGANISM Homo sapiens
Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1366)
AUTHORS Foster,D. and Davie,E.W.
TITLE Characterization of a cDNA coding for human protein C
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 81 (15), 4766-4770 (1984)
MEDLINE 84272714
COMMENT Protein C is a precursor to a serine protease called 'activated
protein C' that has a strong anticoagulant activity. The amino acid
sequence as determined from the cDNA indicates that protein C is
synthesized as a single-chain polypeptide containing the light
chain and the heavy chain connected by a dipeptide of Lys-Arg. This
precursor peptide is then converted to the light and heavy chains
by cleavage of two or more internal peptide bonds. The amino acid
sequence of human protein C shows a high homology with that of the
bovine molecule. Two clones were sequenced in [1] and shown to
code for human protein C. Clone lambda-HC1026 covers bp 146-1140,
and clone lambda-HC1375 covers bp 1-1366. The two cDNA clones had
a poly-A tail at different positions; both poly-A sites were
preceded by poly-A signals [1].
Location/Qualifiers 
1. .1366 

/organism="Homo sapiens" 
/ db_xr e f = " t axon : 9 6 0 6 * 
/tissue_type=" liver" 
/tissue__lib="of Woo" 
/map="2al3-q21" 
<1. .1366 
/gene= B PROC M 
/note="G00-120-317" 
<1. .1140 
/gene="PROC" 
/note= M G00-120-317" 
1..1366 
/gene="PR0C n 
<1. .277 
/gene="PROC" 
/note="G00-120-317" 
/product= "protein C light chain" 
<1. .1073 
/gene= " PROC ■ 
/note=" . " 
/codon__start=2 
/db_xref="GDB:G00-120-317" 
/product^ "protein C" 
/db_xref="PID:gl90323 B 

/translations "QGHGTCIDGIGSFSCDCRSGWEGRFCQREVSFLNCSLDNGGCTH 
YCLEEVGWRRCSCAPGYKLGDDLLQCHPAVKFPCGRPWKRMEKKRSHLKRDTEDQEDQ 
VDPRLIDGKMTRRGDS PWQWLLDSKKKLACGAVLIHPSWVLTAAHCMDESKKLLVRL 
GEYDLRRWEKl^LDLDIKEvTvlIPNYSKSTTDNDIALLHIAQPATLSQTIV 

FIG. 11A 



FEATURES 

source 



mRNA 



mRNA 



gene 



mat__pepcide 



CDS 



SUBSTITUTE SHEET (RULE 26) 



WO 99/50454 



PCT/US99/06473 



28/97 



                     GLAERELNQAGQETLVTGWGYHSSREKEAKRNRTFVLNFIKIPVVPHNECSEVMSNMV
                     SENMLCAGILGDRQDACEGDSGGPMVASFHGTWFLVGLVSWGEGCGLLHNYGVYTKVS
                     RYLDWIHGHIRDKEAPQKSWAP"
     mat_peptide     284..1069
                     /gene="PROC"
                     /note="G00-120-317"
                     /product="protein C heavy chain"
     mat_peptide     320..1069
                     /gene="PROC"
                     /note="G00-120-317"
                     /product="protein C activated heavy chain"
BASE COUNT      302 a    388 c    425 g    251 t
ORIGIN      207 bp upstream of PstI site; chromosome 2q14-q21.
1 ccaagggcac ggcacgtgca tcgacggcat cggcagcttc agctgcgact gccgcagcgg
61 ctgggagggc cgcttctgcc agcgcgaggt gagcttcctc aattgctctc tggacaacgg
121 cggctgcacg cattactgcc tagaggaggt gggctggcgg cgctgtagct gtgcgcctgg
181 ctacaagctg ggggacgacc tcctgcagtg tcaccccgca gtgaagttcc cttgtgggag
241 gccctggaag cggatggaga agaagcgcag tcacctgaaa cgagacacag aagaccaaga
301 agaccaagta gatccgcggc tcattgatgg gaagatgacc aggcggggag acagcccctg
361 gcaggtggtc ctgctggact caaagaagaa gctggcctgc ggggcagtgc tcatccaccc
421 ctcctgggtg ctgacagcgg cccactgcat ggacgagtcc aagaagctcc ttgtcaggct
481 tggagagtat gacctgcggc gctgggagaa gtgggagctg gacctggaca tcaaggaggt
541 cttcgtccac cccaactaca gcaagagcac caccgacaat gacatcgcac tgctgcacct
601 ggcccagccc gccaccctct cgcagaccat agtgcccatc tgcctcccgg acagcggcct
661 tgcagagcgc gagctcaatc aggccggcca ggagaccctc gtgacgggct ggggctacca
721 cagcagccga gagaaggagg ccaagagaaa ccgcaccttc gtcctcaact tcatcaagat
781 tcccgtggtc ccgcacaatg agtgcagcga ggtcatgagc aacatggtgt ctgagaacat
841 gctgtgtgcg ggcatcctcg gggaccggca ggatgcctgc gagggcgaca gtggggggcc
901 catggtcgcc tccttccacg gcacctggtt cctggtgggc ctggtgagct ggggtgaggg
961 ctgtgggctc cttcacaact acggcgttta caccaaagtc agccgctacc tcgactggat
1021 ccatgggcac atcagagaca aggaagcccc ccagaagagc tgggcacctt agcgaccctc
1081 cctgcagggc tgggcttttg catggcaatg gatgggacat taaagggaca tgtaacaagc
1141 acaccggcct gctgttctgt ccttccatcc ctcttttggg ctcttctgga gggaagtaac
1201 atttactgag cacctgttgt atgtcacatg ccttatgaat agaatcttaa ctcctagagc
1261 aactctgtcg ggtggggagg agcagatcca agttttgcgg ggtctaaagc tgtgtgtgtt
1321 gagggggata ctctgtttat gaaaaagaat aaaaaacaca accacg
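
The HUMPRC record states that the light and heavy chains are connected by a Lys-Arg dipeptide, and its feature table ends the light chain at base 277 and starts the heavy chain at base 284. The following Python sketch checks that the six bases in between encode Lys-Arg and derives the chain lengths implied by the coordinates. The helper names and the two-entry codon table are assumptions introduced for illustration only; the coordinates and the 60 bases are copied from the record.

```python
# Feature coordinates (1-based, inclusive) taken from the HUMPRC feature table.
LIGHT_CHAIN_END = 277
HEAVY_CHAIN = (284, 1069)
ACTIVATED_HEAVY_CHAIN = (320, 1069)

# Bases 241-300 of the ORIGIN block, with the spacing between blocks removed.
LINE_START = 241
LINE_241 = (
    "gccctggaag cggatggaga agaagcgcag tcacctgaaa cgagacacag aagaccaaga"
).replace(" ", "")

# Hypothetical minimal codon table: only the two codons needed here.
CODONS = {"aaa": "K", "cga": "R"}


def span_length(start, end):
    """Length in bases of a 1-based inclusive span."""
    return end - start + 1


def slice_1based(seq, seq_start, start, end):
    """Extract bases start..end from a fragment whose first base is seq_start."""
    return seq[start - seq_start:end - seq_start + 1]


# Bases lying between the light chain and the heavy chain (278..283).
linker = slice_1based(LINE_241, LINE_START, LIGHT_CHAIN_END + 1, HEAVY_CHAIN[0] - 1)
linker_peptide = "".join(CODONS[linker[i:i + 3]] for i in range(0, len(linker), 3))

heavy_codons = span_length(*HEAVY_CHAIN) // 3
activated_codons = span_length(*ACTIVATED_HEAVY_CHAIN) // 3
activation_peptide_codons = heavy_codons - activated_codons

print(linker, linker_peptide, heavy_codons, activated_codons, activation_peptide_codons)
```

Under these assumptions the difference between the two heavy-chain features corresponds to the segment removed on activation; the sketch only does coordinate arithmetic and does not translate the full coding sequence.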
LOCUS       HUMLDLR02     144 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 2.
ACCESSION   L00336 K02573
NID         g187078
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     2 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 16 to 138)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 23; 132 to 144)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..144
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron A"
     exon            16..138
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=2
     intron          139..>144
                     /gene="LDLR"
                     /note="LDL intron B"
BASE COUNT       33 a     33 c     46 g     32 t
ORIGIN      Chromosome 19p13.2-p13.1; about 10 kb after segment 1.
1 tttcctctct ctcagtgggc gacagatgtg aaagaaacga gttccagtgc caagacggga
61 aatgcatctc ctacaagtgg gtctgcgatg gcagcgctga gtgccaggat ggctctgatg
121 agtcccagga gacgtgctgt gagt
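
Each record carries a BASE COUNT line that can be checked against its ORIGIN block, which is a quick way to catch transcription damage in a listed sequence. A minimal Python sketch for the HUMLDLR02 record follows; the function names are hypothetical, and the sequence text and the reported counts (33 a, 33 c, 46 g, 32 t) are copied from the record.

```python
from collections import Counter

# ORIGIN block of HUMLDLR02, copied from the record.
ORIGIN_LINES = """
1 tttcctctct ctcagtgggc gacagatgtg aaagaaacga gttccagtgc caagacggga
61 aatgcatctc ctacaagtgg gtctgcgatg gcagcgctga gtgccaggat ggctctgatg
121 agtcccagga gacgtgctgt gagt
"""

# Counts reported on the BASE COUNT line of the same record.
REPORTED = {"a": 33, "c": 33, "g": 46, "t": 32}


def parse_origin(text):
    """Join the sequence blocks of an ORIGIN section, dropping position numbers."""
    blocks = []
    for line in text.strip().splitlines():
        fields = line.split()
        blocks.extend(fields[1:])  # fields[0] is the position number
    return "".join(blocks)


def base_count(seq):
    """Count each of the four bases in a lowercase nucleotide string."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) for base in "acgt"}


sequence = parse_origin(ORIGIN_LINES)
counts = base_count(sequence)
exon = sequence[16 - 1:138]  # exon feature 16..138, 1-based inclusive

print(len(sequence), counts, counts == REPORTED, len(exon))
```

The same check can be run on any other record in these figures by swapping in its ORIGIN block and BASE COUNT values; a mismatch of one base usually points at a single misread character.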
LOCUS       HUMLDLR04     402 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 4.
ACCESSION   L00338 K02573
NID         g187080
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     4 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
            [1].
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 16 to 396)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 23; 389 to 402)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..402
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron C"
     exon            16..396
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=4
     intron          397..>402
                     /gene="LDLR"
                     /note="LDL intron D"
BASE COUNT       73 a    131 c    120 g     78 t
ORIGIN      Chromosome 19p13.2-p13.1; about 2.4 kb after segment 3.
1 catccatccc tgcagccccc aagacgtgct cccaggacga gtttcgctgc cacgatggga
61 agtgcatctc tcggcagttc gtctgtgact cagaccggga ctgcttggac ggctcagacg
121 aggcctcctg cccggtgctc acctgtggtc ccgccagctt ccagtgcaac agctccacct
181 gcatccccca gctgtgggcc tgcgacaacg accccgactg cgaagatggc tcggatgagt
241 ggccgcagcg ctgtaggggt ctttacgtgt tccaagggga cagtagcccc tgctcggcct
301 tcgagttcca ctgcctaagt ggcgagtgca tccactccag ctggcgctgt gatggtggcc
361 ccgactgcaa ggacaaatct gacgaggaaa actgcggtat gg

FIG. 13






LOCUS       HUMLDLR09     193 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 9.
ACCESSION   L00343 K02573
NID         g187085
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     9 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
            [1].
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 16 to 187)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 23; 180 to 193)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..193
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron H"
     exon            16..187
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=9
     intron          188..>193
                     /gene="LDLR"
                     /note="LDL intron I"
BASE COUNT       44 a     64 c     52 g     33 t
ORIGIN      Chromosome 19p13.2-p13.1; about 1.2 kb after segment 8.
1 tccccggacc cccaggctcc atcgcctacc tcttcttcac caaccggcac gaggtcagga
61 agatgacgct ggaccggagc gagtacacca gcctcatccc caacctgagg aacgtggtcg
121 ctctggacac ggaggtggcc agcaatagaa tctactggtc tgacctgtcc cagagaatga
181 tctgcaggtg agc
LOCUS       HUMLDLR10     249 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 10.
ACCESSION   L00344 K02573
NID         g187086
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     10 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 16 to 243)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 23; 236 to 249)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..249
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron I"
     exon            16..243
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=10
     intron          244..>249
                     /gene="LDLR"
                     /note="LDL intron J"
BASE COUNT       51 a     77 c     71 g     50 t
ORIGIN      Chromosome 19p13.2-p13.1; about 900 bp after segment 9.
1 ctcctcctgc ctcagcaccc agcttgacag agcccacggc gtctcttcct atgacaccgt
61 catcagcagg gacatccagg cccccgacgg gctggctgtg gactggatcc acagcaacat
121 ctactggacc gactctgtcc tgggcactgt ctctgttgcg gataccaagg gcgtgaagag
181 gaaaacgtta ttcagggaga acggctccaa gccaagggcc atcgtggtgg atcctgttca
241 tgggtgcgt

FIG. 15






LOCUS       HUMLDLR11     140 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 11.
ACCESSION   L00345 K02573
NID         g187087
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     11 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 6 to 134)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 22; 128 to 140)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..140
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron J"
     exon            16..134
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=11
     intron          135..>140
                     /gene="LDLR"
                     /note="LDL intron K"
BASE COUNT       34 a     38 c     37 g     31 t
ORIGIN      Chromosome 19p13.2-p13.1; about 2.6 kb after segment 10.
1 ctgtcctccc accagcttca tgtactggac tgactgggga actcccgcca agatcaagaa
61 agggggcctg aatggtgtgg acatctactc gctggtgact gaaaacattc agtggcccaa
121 tggcatcacc ctaggtatgt

FIG. 16
LOCUS       HUMLDLR13     163 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 13.
ACCESSION   L00347 K02573
NID         g187089
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     13 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 16 to 157)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 24; 151 to 163)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..163
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron L"
     exon            16..157
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=13
     intron          158..>163
                     /gene="LDLR"
                     /note="LDL intron M"
BASE COUNT       43 a     45 c     34 g     41 t
ORIGIN      Chromosome 19p13.2-p13.1; about 3 kb after segment 12.
1 ttgctgcctg tttaggacaa agtattttgg acagatatca tcaacgaagc cattttcagt
61 gccaaccgcc tcacaggttc cgatgtcaac ttgttggctg aaaacctact gtccccagag
121 gatatggtcc tcttccacaa cctcacccag ccaagaggta agg
LOCUS       HUMLDLR15     192 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 15.
ACCESSION   L00349 K02573
NID         g187091
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     15 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
            [1].
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 16 to 186)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 23; 179 to 192)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..192
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron N"
     exon            16..186
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=15
     intron          187..>192
                     /gene="LDLR"
                     /note="LDL intron O"
BASE COUNT       46 a     64 c     49 g     33 t
ORIGIN      Chromosome 19p13.2-p13.1; about 2.8 kb after segment 14.
1 tatttattct ttcagaggct gaggctgcag tggccaccca ggagacatcc accgtcaggc
61 taaaggtcag ctccacagcc gtaaggacac agcacacaac cacccggcct gttcccgaca
121 cctcccggct gcctggggcc acccctgggc tcaccacggt ggagatagtg acaatgtctc
181 accaaggtaa ag
LOCUS       HUMLDLR17     179 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 17.
ACCESSION   L00351 K02573
NID         g187093
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     17 of 18
SOURCE      Human DNA [3] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
            [1].
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 16 to 173)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 57 to 101)
  AUTHORS   Lehrman,M.A., Goldstein,J.L., Brown,M.S., Russell,D.W. and
            Schneider,W.J.
  TITLE     Internalization-defective LDL receptors produced by genes with
            nonsense and frameshift mutations that truncate the cytoplasmic
            domain
  JOURNAL   Cell 41 (3), 735-743 (1985)
  MEDLINE   85228224
REFERENCE   3  (bases 1 to 23; 164 to 179)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..179
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     intron          <1..15
                     /gene="LDLR"
                     /note="LDL intron P"
     exon            16..173
                     /gene="LDLR"
                     /note="G00-119-362"
                     /number=17
                     76..77
                     /gene="LDLR"
                     /note="ac in wt; aagaac in internalization-defective
                     familial hypercholesterolemia [2]"
     intron          174..>179
                     /gene="LDLR"
                     /note="LDL intron Q"
BASE COUNT       42 a     56 c     39 g     42 t
ORIGIN      Chromosome 19p13.2-p13.1; about 1.4 kb after segment 16.
1 tgcctctccc tacagtgctc ctcgtcttcc tttgcctggg ggtcttcctt ctatggaaga
61 actggcggct taagaacatc aacagcatca actttgacaa ccccgtctat cagaagacca
121 cagaggatga ggtccacatt tgccacaacc aggacggcta cagctacccc tcggtgagt
LOCUS       HUMLDLR01     769 bp    DNA             PRI       30-NOV-1994
DEFINITION  Human low density lipoprotein receptor gene, exon 1.
ACCESSION   L29401 K02573 M10664 N00033
NID         g460288
KEYWORDS    low density lipoprotein receptor-1; repeat region.
SEGMENT     1 of 18
SOURCE      Human DNA [2] and fetal adrenal gland, cDNA to mRNA, clone pLDLR-2
            [1].
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (sites)
  AUTHORS   Yamamoto,T., Davis,C.G., Brown,M.S., Schneider,W.J., Casey,M.L.,
            Goldstein,J.L. and Russell,D.W.
  TITLE     The human LDL receptor: a cysteine-rich protein with multiple Alu
            sequences in its mRNA
  JOURNAL   Cell 39 (1), 27-38 (1984)
  MEDLINE   85024898
REFERENCE   2  (bases 1 to 769)
  AUTHORS   Sudhof,T.C., Goldstein,J.L., Brown,M.S. and Russell,D.W.
  TITLE     The LDL receptor gene: a mosaic of exons shared with different
            proteins
  JOURNAL   Science 228 (4701), 815-822 (1985)
  MEDLINE   85218750
COMMENT     Bases 1-769 from Science 228, 815-822 (1985)
            Bases 675-754 from Cell 39, 27-38 (1984)

            Draft entry and computer-readable sequence for [1] kindly provided
            by D.Russell, 01-MAR-1985.
FEATURES             Location/Qualifiers
     source          1..769
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="19p13.3"
     exon            595..754
                     /gene="LDLR"
                     /note="low density lipoprotein receptor; G00-119-362"
                     /number=1
     sig_peptide     688..750
                     /gene="LDLR"
                     /note="low density lipoprotein receptor signal peptide"
     intron          755..>769
                     /gene="LDLR"
                     /note="LDL intron A"
BASE COUNT      220 a    169 c    194 g    186 t
ORIGIN      Chromosome 19p13.2-p13.1; 1 bp upstream of BamHI site.
1 ggatcccaca aaacaaaaaa tatttttttg gctgtacttt tgtgaagatt ttatttaaat
61 tcctgattga tcagtgtcta ttaggtgatt tggaataaca atgtaaaaac aatatacaac
121 gaaaggaagc taaaaatcta tacacaattc ctagaaagga aaaggcaaat atagaaagtg
181 gcggaagttc ccaacatttt tagtgttttc cttttgaggc agagaggaca atggcattag
241 gctattggag gatcttgaaa ggctgttgtt atccttctgt ggacaacaac agcaaaatgt
301 taacagttaa acatcgagaa atttcaggag gatctttcag aagatgcgtt tccaattttg
361 agggggcgtc agctcttcac cggagaccca aatacaacaa atcaagtcgc ctgccctggc
421 gacactttcg aaggactgga gtgggaatca gagcttcacg ggttaaaagc cgatgtcaca
481 tcggccgttc gaaactcctc ctcttgcagt gaggtgaaga catttgaaaa tcaccccact
541 gcaaactcct ccccctgcta gaaacctcac attgaaatgc tgtaaatgac gtgggccccg
601 agtgcaatcg cgggaagcca gggtttccag ctaggacaca gcaggtcgtg atccgggtcg
661 ggacactgcc tggcagaggc tgcgagcatg gggccctggg gctggaaatt gcgctggacc
721 gtcgccttgc tcctcgccgc ggcggggact gcaggtaagg cttgctcca
LOCUS       HUMF511       279 bp    DNA             PRI       10-NOV-1994
DEFINITION  Human coagulation factor V gene, exon 11.
ACCESSION   L32765 J05368
NID         g488094
KEYWORDS    coagulation factor V; factor V.
SEGMENT     11 of 25
SOURCE      Homo sapiens DNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 279)
  AUTHORS   Kane,W.H. and Davie,E.W.
  TITLE     Cloning of a cDNA coding for human factor V, a blood coagulation
            factor homologous to factor VIII and ceruloplasmin
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 83 (18), 6800-6804 (1986)
  MEDLINE   86313665
REFERENCE   2  (bases 1 to 279)
  AUTHORS   Kane,W.H., Ichinose,A., Hagen,F.S. and Davie,E.W.
  TITLE     Cloning of cDNAs coding for the heavy chain region and connecting
            region of human factor V, a blood coagulation factor with four
            types of internal repeats
  JOURNAL   Biochemistry 26 (20), 6508-6514 (1987)
  MEDLINE   88107560
REFERENCE   3  (bases 1 to 279)
  AUTHORS   Jenny,R.J., Pittman,D.D., Toole,J.J., Kriz,R.W., Aldape,R.A.,
            Hewick,R.M., Kaufman,R.J. and Mann,K.G.
  TITLE     Complete cDNA and derived amino acid sequence of human factor V
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 84 (14), 4846-4850 (1987)
  MEDLINE   87260886
REFERENCE   4  (bases 1 to 279)
  AUTHORS   Cripe,L.D., Moore,K.D. and Kane,W.H.
  TITLE     Structure of the gene for human coagulation factor V
  JOURNAL   Biochemistry 31 (15), 3777-3785 (1992)
  MEDLINE   92232668
REFERENCE   5  (bases 1 to 279)
  AUTHORS   Shen,N.L., Fan,S.T., Pyati,J., Graff,R., LaPolla,R.J. and
            Edgington,T.S.
  TITLE     The serine protease cofactor factor V is synthesized by
            lymphocytes
  JOURNAL   J. Immunol. 150 (7), 2992-3001 (1993)
  MEDLINE   93203619
FEATURES             Location/Qualifiers
     source          1..279
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /tissue_type="placenta"
                     /cell_type="fibroblast"
                     /map="1q21-q25"
     intron          order(L32764:277..>319,<1..74)
                     /gene="F5"
                     /note="3.1 kb gap; G00-119-896"
                     /number=10
     exon            75..225
                     /gene="F5"
                     /note="G00-119-896"
                     /number=11
BASE COUNT       73 a     52 c     61 g     93 t
ORIGIN
1 tctgagttct ctattctgtt ccattggtct atgcgtctgt tcttgtacca gtactatact
61 gttttgtcct ccagagggca gcagacatcg aacagcaggc tgtgtttgct gtgtttgatg
121 agaacaaaag ctggtacctt gaggacaaca tcaacaagtt ttgtgaaaat cctgatgagg
181 tgaaacgtga tgaccccaag ttttatgaat caaacatcat gagcagtaag tcagagtact
241 atttttgttc atcagttttt cattcctgtg gttgaaata
LOCUS       HUMHMGCOA    2904 bp    mRNA            PRI       08-NOV-1994
DEFINITION  Human 3-hydroxy-3-methylglutaryl coenzyme A reductase mRNA,
            complete cds.
ACCESSION   M11058
NID         g184243
KEYWORDS    3-hydroxy-3-methylglutaryl coenzyme A reductase; glycoprotein.
SOURCE      Human fetal adrenal gland, cDNA to mRNA, library of T.Maniatis,
            clone pHRed-102.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2904)
  AUTHORS   Luskey,K.L. and Stevens,B.
  TITLE     Human 3-hydroxy-3-methylglutaryl coenzyme A reductase. Conserved
            domains responsible for catalytic activity and sterol-regulated
            degradation
  JOURNAL   J. Biol. Chem. 260 (18), 10271-10277 (1985)
  MEDLINE   85261451
COMMENT     Draft entry and sequence in computer readable form for [1] kindly
            provided by K.L.Luskey, 16-JAN-1986.

            HMG-CoA reductase is the rate-limiting enzyme for cholesterol
            synthesis and is regulated via a negative feedback mechanism
            mediated by sterols and non-sterol metabolites derived from
            mevalonate, the product of the reaction catalyzed by reductase.
            Normally in mammalian cells this enzyme is suppressed by
            cholesterol derived from the internalization and degradation of
            low density lipoprotein (LDL) via the LDL receptor. Competitive
            inhibitors of the reductase induce the expression of LDL receptors
            in the liver, which in turn increases the catabolism of plasma LDL
            and lowers the plasma concentration of cholesterol, an important
            determinant of atherosclerosis.

            The sequence coding for the highly conserved membrane bound region
            of the protein is located at positions 51-1067, that coding for
            the linker part of the protein at positions 1068-1397 and for the
            strongly conserved water-soluble catalytic part at positions
            1398-2714.
FEATURES             Location/Qualifiers
     source          1..2904
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="5q13.3-q14"
     mRNA            <1..>2904
                     /note="HMG CoA mRNA"
     gene            51..2717
                     /gene="HMGCR"
     CDS             51..2717
                     /gene="HMGCR"
                     /note="3-hydroxy-3-methylglutaryl coenzyme A reductase"
                     /codon_start=1
                     /db_xref="GDB:G00-119-312"
                     /db_xref="PID:g306865"
                     /translation="MLSRLFRMHGLFVASHPWEVIVGTVTLTICMMSMNMFTGNNKIC
                     GWNYECPKFEEDVLSSDIIILTITRCIAILYIYFQFQNLRQLGSKYILGIAGLFTIFS
                     SFVFSTVVIHFLDKELTGLNEALPFFLLLIDLSRASTLAKFALSSNSQDEVRENIARG
                     MAILGPTFTLDALVECLVIGVGTMSGVRQLEIMCCFGCMSVLANYFVFMTFFPACVSL
                     VLELSRESREGRPIWQLSHFARVLEEEENKPNPVTQRVKMIMSLGLVLVHAHSRWIAD
                     PSPQNSTADTSKVSLGLDENVSKRIEPSVSLWQFYLSKMISMDIEQVITLSLALLLAV
                     KYIFFEQTETESTLSLKNPITSPVVTQKKVPDNCCRREPMLVRNNQKCDSVEEETGIN
                     RERKVEVIKPLVAETDTPNRATFVVGNSSLLDTSSVLVTQEPEIELPREPRPNEECLQ
                     ILGNAEKGAKFLSDAEIIQLVNAKHIPAYKLETLMETHERGVSIRRQLLSKKLSEPSS
                     LQYLPYRDYNYSLVMGACCENVIGYMPIPVGVAGPLCLDEKEFQVPMATTEGCLVAST
                     NRGCRAIGLGGGASSRVLADGMTRGPVVRLPRACDSAEVKAWLETSEGFAVIKEAFDS
                     TSRFARLQKLHTSIAGRNLYIRFQSRSGDAMGMNMISKGTEKALSKLHEYFPEMQILA
                     VSGNYCTDKKPAAINWIEGRGKSVVCEAVIPAKVVREVLKTTTEAMIEVNINKNLVGS
                     AMAGSIGGYNAHAANIVTAIYIACGQDAAQNVGSSNCITLMEASGPTNEDLYISCTMP
                     SIEIGTVGGGTNLLPQQACLQMLGVQGACKDNPGENARQLARIVCGTVMAGELSLMAA
                     LAAGHLVKSHMIHNRSKINLQDLQGACTKKTA"
BASE COUNT      822 a    597 c    678 g    807 t
ORIGIN      27 bp upstream of BamHI site; chromosome 5q13.3-q14.

1 ttcggtggcc tctagtgaga tctggaggat ccaaggattc tgtagctaca atgttgtcaa 

61 gactttttcg aatgcatggc ctctttgtgg cctcccatcc ctgggaagtc atagtgggga 

121 cagtgacact gaccatctgc atgatgtcca tgaacatgtt tactggtaac aataagatct 

181 gtggttggaa ttatgaatgt ccaaagtttg aagaggatgt tttgagcagt gacattataa 

241 ttctgacaat aacacgatgc atagccatcc tgtatattta cttccagttc cagaatttac 

301 gtcaacttgg atcaaaatat attttgggta ttgctggcct tttcacaatt ttctcaagtt 

361 ttgtattcag tacagttgtc attcacttct tagacaaaga attgacaggc ttgaatgaag 

421 ctttgccctt tttcctactt ttgattgacc tttccagagc aagcacatta gcaaagtttg 

481 ccctcagttc caactcacag gatgaagtaa gggaaaatat tgctcgtgga atggcaattt 

541 taggtcctac gtttaccctc gatgctcttg ttgaatgtct tgtgattgga gttggtacca 

601 tgtcaggggt acgtcagctt gaaattatgt gctgctttgg ctgcatgtca gttcttgcca 

661 actacttcgt gttcatgact ttcttcccag cttgtgtgtc cttggtatta gagctttctc 

721 gggaaagccg cgagggtcgt ccaatttggc agctcagcca ttttgcccga gttttagaag 

781 aagaagaaaa taagccgaat cctgtaactc agagggtcaa gatgattatg tctctaggct 

841 tggttcttgt tcatgctcac agtcgctgga tagctgatcc ttctcctcaa aacagtacag 

901 cagatacttc taaggtttca ttaggactgg atgaaaatgt gtccaagaga attgaaccaa 

961 gtgtttccct ctggcagttt tatctctcta aaatgatcag catggatatt gaacaagtta 

1021 ttaccctaag tttagctctc cttctggctg tcaagtacat cttctttgaa caaacagaga 

1081 cagaatctac actctcatta aaaaacccta tcacatctcc tgtagtgaca caaaagaaag 

1141 tcccagacaa ttgttgtaga cgtgaaccta tgctggtcag aaataaccag aaatgtgatt 

1201 cagtagagga agagacaggg ataaaccgag aaagaaaagt tgaggttata aaacccttag 

1261 tggctgaaac agatacccca aacagagcta catttgtggt tggtaactcc tccttactcg 

1321 atacttcatc agtactggtg acacaggaac ctgaaattga acttcccagg gaacctcggc 

1381 ctaatgaaga atgtctacag atacttggga atgcagagaa aggtgcaaaa ttccttagtg 

1441 atgctgagat catccagtta gtcaatgcta agcatatccc agcctacaag ttggaaactc 

1501 tgatggaaac tcatgagcgt ggtgtatcta ttcgccgaca gttactttcc aagaagcttt 

1561 cagaaccttc ttctctccag tacctacctt acagggatta taattactcc ttggtgatgg 

1621 gagcttgttg tgagaatgtt attggatata tgcccatccc tgttggagtg gcaggacccc 

1681 tttgcttaga tgaaaaagaa tttcaggttc caatggcaac aacagaaggt tgtcttgtgg 

1741 ccagcaccaa tagaggctgc agagcaatag gtcttggtgg aggtgccagc agccgagtcc 

1801 ttgcagatgg gatgactcgt ggcccagttg tgcgtcttcc acgtgcttgt gactctgcag 

1861 aagtgaaagc ctggctcgaa acatctgaag ggttcgcagt gataaaggag gcatttgaca 

1921 gcactagcag atttgcacgt ctacagaaac ttcatacaag tatagctgga cgcaaccttt 

1981 atatccgttt ccagtccagg tcaggggatg ccatggggat gaacatgatt tcaaagggta 

2041 cagagaaagc actttcaaaa cttcacgagt atttccctga aatgcagatt ctagccgtta 

2101 gtggtaacta ttgtactgac aagaaacctg ctgctataaa ttggatagag ggaagaggaa 

2161 aatctgttgt ttgtgaagct gtcattccag ccaaggttgt cagagaagta ttaaagacta 

2221 ccacagaggc tatgattgag gtcaacatta acaagaattt agtgggctct gccatggctg 

2281 ggagcatagg aggctacaac gcccatgcag caaacattgt caccgccatc tacattgcct 

2341 gtggacagga tgcagcacag aatgttggta gttcaaactg tattacttta atggaagcaa 

2401 gtggtcccac aaatgaagat ttatatatca gctgcaccat gccatctata gagataggaa 

2461 cggtgggtgg tgggaccaac ctactacctc agcaagcctg tttgcagatg ctaggtgttc 

2521 aaggagcatg caaagataat cctggggaaa atgcccggca gcttgcccga attgtgtgtg 
2581 ggaccgtaat ggctggggaa ttgtcactta tggcagcatt ggcagcagga catcttgtca

2641 aaagtcacat gattcacaac aggtcgaaga tcaatttaca agacctccaa ggagcttgca

2701 ccaagaagac agcctgaata gcccgacagt tctgaactgg aacatgggca ttgggttcta

2761 aaggactaac ataaaatctg tgaattaaaa aagctcaatg cattgtcctg tggaggatga

2821 ataaatgtga tcactgagac agccacttgg tttttggctc tttcagagag gtctcaggtt

2881 ctttccatgc agactcctca gatc

FIG. 22C
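
The COMMENT of the HUMHMGCOA record divides the coding sequence into a membrane-bound region (51-1067), a linker (1068-1397) and a catalytic part (1398-2714), while the CDS feature spans 51..2717. As a rough consistency check, the Python sketch below converts those spans to codon counts and confirms that the three regions tile the coding sequence with only the stop codon left over. The region labels and function name are assumptions made for illustration; the coordinates are those given in the record.

```python
# Coordinates (1-based, inclusive) quoted in the HUMHMGCOA record.
CDS = (51, 2717)
REGIONS = [
    ("membrane-bound region", 51, 1067),
    ("linker", 1068, 1397),
    ("catalytic region", 1398, 2714),
]


def codon_count(start, end):
    """Number of whole codons in a 1-based inclusive span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length // 3


region_codons = {name: codon_count(start, end) for name, start, end in REGIONS}

# Each region should begin exactly one base after the previous one ends.
contiguous = all(
    REGIONS[i][2] + 1 == REGIONS[i + 1][1] for i in range(len(REGIONS) - 1)
)

total_residues = sum(region_codons.values())
cds_codons = codon_count(*CDS)  # includes the terminal stop codon

print(region_codons, contiguous, total_residues, cds_codons)
```

The residue total obtained this way can be compared with the length of the /translation qualifier in the same record, which is one codon shorter than the CDS span because the stop codon is not translated.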






42/97

LOCUS       HUMPRCA     11725 bp    DNA             PRI       08-JAN-1995
DEFINITION  Human protein C gene, complete cds.
ACCESSION   M11228
NID         g190333
KEYWORDS    glycoprotein; protease; protein C; serine protease.
SOURCE      Human DNA, clones PC-lambda-8 and PC-lamda-6.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 11725)
  AUTHORS   Foster,D.C., Yoshitake,S. and Davie,E.W.
  TITLE     The nucleotide sequence of the gene for human protein C
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 82 (14), 4673-4677 (1985)
  MEDLINE   85270390
FEATURES             Location/Qualifiers
     source          1..11725
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="2q13-q21"
     gene            2131..2200
                     /gene="PROC"
     exon            <2131..2200
                     /gene="PROC"
                     /note="Protein C; G00-120-317"
                     /number=1
     sig_peptide     join(2131..2200,3464..3519)
                     /note="Protein C signal peptide"
     CDS             join(2131..2200,3464..3630,5093..5117,5210..5347,
                     5450..5584,8253..8395,9269..9386,10516..11105)
                     /note="Protein C"
                     /codon_start=1
                     /db_xref="PID:g190334"

                     /translation="MWQLTSLLLFVATWGISGTPAPLDSVFSSSERAHQVLRIRKRAN
SFLEELRHSSLERECIEEICDFEEAKEIFQNVDDTLAFWSKHVIX3DQCLVLPLEHPCA 
SLCCGHGTCIDGIGSFSCDCRSGWEGRFCQREVSFLNCSLDNGGCTHYCLEEVGWRRC 
SCAPGYKLGDDLLQCHPAV1CFPCGRPWKRMEKKRSHLKRDTEDQEDQVDPRLIDGKOT 
RRGDSPWQWLLDSKKKLACGAVLIHPSWVLTAAHCMDESKK^ 

ELDLDIKEVTV1IPNYSKSTTDNDIALLHLAQPATLSQTIVPICLPDSGLAERELNQAG 
QETLVTGWGYHSSREKEAKRNRTFVLNFIKI PWPHNECSEVMSNMVSENMLCAGILG 
DRQDACEGDSGGPMVASFHGTOTLVGLVSWGEGCGLLHNYGVTTKVSRYIJDWIHGHIR 



DKEAPQKSWAP"
     intron          2201..3463
                     /note="ProC cds intron A"
     exon            3464..3630
                     /number=2
     mat_peptide     join(3520..3630,5093..5117,5210..5347,5450..5584,
                     8253..8395,9269..9386,10516..11102)
     intron          3631..5092
                     /note="ProC cds intron B"
     exon            5093..5117
                     /number=3
     intron          5118..5209
                     /note="ProC cds intron C"
     exon            5210..5347
                     /number=4
     intron          5348..5449

FIG. 23A 

                     /note="ProC cds intron D"
     exon            5450..5584
                     /number=5
     intron          5585..8252
                     /note="ProC cds intron E"
     exon            8253..8395
                     /number=6
     intron          8396..9268
                     /note="ProC cds intron F"
     exon            9269..9386
                     /number=7
     intron          9387..10515
                     /note="ProC cds intron G"
     exon            10516..>11105
                     /note="Protein C"
                     /number=8

BASE COUNT 2444 a 3298 c 3375 g 2608 t 

ORIGIN 575 bp upstream of StuI site; chromosome 2q14-q21. 

1 agtgaatctg ggcgagtaac acaaaacttg agtgtcctta cctgaaaaat agaggttaga 
61 gggatgctat gtgccattgt gtgtgtgtgt tgggggtggg gattgggggt gatttgtgag 
121 caattggagg tgagggtgga gcccagtgcc cagcacctat gcactgggga cccaaaaagg 
181 agcatcttct catgatttta tgtatcagaa attgggatgg catgtcattg ggacagcgtc 
241 ttttttcttg tatggtggca cataaataca tgtgtcttat aattaatggt attttagatt 
301 tgacgaaata tggaatatta cctgttgtgc tgatcttggg caaactataa tatctctggg 
361 caaaaatgtc cccatctgaa aaacagggac aacgttcctc cctcagccag ccactatggg 
421 gctaaaatga gaccacatct gtcaagggtt ttgccctcac ctccctccct gctggatggc 
481 atccttggta ggcagaggtg ggcttcgggc agaacaagcc gtgctgagct aggaccagga 
541 gtgctagtgc cactgtttgt ctatggagag ggaggcctca gtgctgaggg ccaagcaaat 
601 atttgtggtt atggattaac tcgaactcca ggctgtcatg gcggcaggac ggcgaacttg 
661 cagtatctcc acgacccgcc cctgtgagtc cccctccagg caggtctatg aggggtgtgg 
721 agggagggct gcccccggga gaagagagct aggtggtgat gagggctgaa tcctccagcc 
781 agggtgctca acaagcctga gcttggggta aaaggacaca aggccctcca caggccaggc 
841 ctggcagcca cagtctcagg tccctttgcc atgcgcctcc ctctttccag gccaagggtc 
901 cccaggccca gggccattcc aacagacagt ttggagccca ggaccctcca ttctccccac 
961 cccacttcca cctttggggg tgtcggattt gaacaaatct cagaagcggc ctcagaggga 
1021 gtcggcaaga atggagagca gggtccggta gggtgtgcag aggccacgtg gcctatccac 
1081 tggggagggt tccttgatct ctggccacca gggctatctc tgtggccttt tggagcaacc 
1141 tggtggtttg gggcaggggt tgaatttcca ggcctaaaac cacacaggcc tggccttgag 
1201 tcctggctct gcgagtaatg catggatgta aacatggaga cccaggacct tgcctcagtc 
1261 ttccgagtct ggtgcctgca gtgtactgat ggtgtgagac cctactcctg gaggatgggg 
1321 gacagaatct gatcgatccc ctgggttggt gacttccctg tgcaatcaac ggagaccagc 
1381 aagggttgga tttttaataa accacttaac tcctccgagt ctcagtttcc ccctctatga 
1441 aatggggttg acagcattaa taactacctc ttgggtggtt gtgagcctta actgaagtca 
1501 taatatctca tgtttactga gcatgagcta tgtgcaaagc ctgttttgag agctttatgt 
1561 ggactaactc ctttaattct cacaacaccc tttaaggcac agatacacca cgttattcca 
1621 tccattttac aaatgaggaa actgaggcat ggagcagtta agcatcttgc ccaacattgc 
1681 cctccagtaa gtgctggagc tggaatttgc accgtgcagt ctggcttcat ggcctgccct 
1741 gtgaatcctg taaaaattgt ttgaaagaca ccatgagtgt ccaatcaacg ttagctaata 
1801 ttctca ^ ccc agtcatcaga ccggcagagg cagccacccc actgtcccca gggaggacac 
1861 aaacatcctg gcaccctctc cactgcattc tggagctgct ttctaggcag gcagtgtgag 
1921 ctcagcccca cgtagagcgg gcagccgagg ccttctgagg ctatgtctct agcgaacaag 
1981 gaccctcaat tccagcttcc gcctgacggc cagcacacag ggacagccct ttcattccgc 
2041 ttccacctgg gggtgcaggc agagcagcag cgggggtagc actgcccgga gctcagaagt 
2101 cctcctcaga caggtgccag tgcctccaga atgtggcagc tcacaagcct cctgctgttc 
2161 gtggccacct ggggaatttc cggcacacca gctcctcttg gtaaggccac cccaccccta 
2221 ccccgggacc cttgtggcct ctacaaggcc ctggtggcat ctgcccaggc cttcacagct 
2281 tccaccatct ctctgagccc tgggtgaggt gaggggcaga tgggaatggc aggaatcaac 
2341 tgacaagtcc caggtaggcc agctgccaga gtgccacaca ggggctgcca gggcaggcat 
2401 gcgtgatggc agggagcccc gcgatgacct cctaaagctc cctcctccac acggggatgg 
2461 tcacagagtc ccctgggcct tccctctcca cccactcact ccctcaactg tgaagacccc 
2521 aggcccaggc taccgtccac actatccagc acagcctccc ctactcaaat gcacactggc 
2581 ctcatggctg ccctgcccca acccctttcc tggtctccac agccaacggg aggaggccat 
2641 ^attcttggg gaggtccgca ggcacatggg cccctaaagc cacaccaggc tgttggtttc 
2701 atttgtgcct ttatagagct gtttatctgc ttgggacctg cacctccacc ctttcccaag 
2761 gtgccctcag ctcaggcata ccctcctcta ggatgccttt tcccccatcc cttcttgctc 
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2821 acacccccaa cttgatctct ccctcctaac tgtgccctgc accaagacag acacttcaca 
2881 gagcccagga cacacctggg gacccttcct gggtgatagg tctgtctatc ctccaggtgt 
2941 ccctgcccaa ggggagaagc atggggaata cttggttggg ggaggaaagg aagactgggg 
3001 ggatgtgtca agatggggct gcatgtggtg tactggcaga agagtgagag gatttaactt 
3061 ggcagccttt acagcagcag ccagggcttg agtacttatc tctgggccag gctgtattgg 
3121 atgttttaca tgacggtctc atccccatgt ttttggatga gtaaattgaa ccttagaaag 
3181 gtaaagacac tggctcaagg tcacacagag atcggggtgg ggttcacagg gaggcctgtc 
3241 catctcagag caaggcttcg tcctccaact gccatctgct tcctggggag gaaaagagca 
3301 gaggacccct gcgccaagcc atgacctaga attagaatga gtcttgaggg ggcggagaca 
3361 agaccttccc aggctctccc agctctgctt cctcagaccc cctcatggcc ccagcccctc 
3421 ttaggcccct caccaaggtg agctcccctc cctccaaaac cagactcagt gttctccagc 
3481 agcgagcgtg cccaccaggt gctgcggatc cgcaaacgtg ccaactcctt cctggaggag 
3541 ctccgtcaca gcagcctgga gcgggagtgc atagaggaga tctgtgactt cgaggaggcc 
3601 aaggaaattt tccaaaatgt ggatgacaca gtaaggccac catgggtcca gaggatgagg 
3661 ctcaggggcg agctggtaac cagcaggggc ctcgaggagc aggtggggac tcaatgctga 
3721 ggccctctta ggagttgtgg gggtggctga gtggagcgat taggatgctg gccctatgat 
3781 gtcggccagg cacatgtgac tgcaagaaac agaattcagg aagaagctcc aggaaagagt 
3841 gtggggtgac cctaggtggg gactcccaca gccacagtgt aggtggttca gtccaccctc 
3901 cagccactgc tgagcaccac tgcctccccg tcccacctca caaagagggg acctaaagac 
3961 caccctgctt ccacccatgc ctctgctgat cagggtgtgt gtgtgaccga aactcacttc 
4021 tgtccacata aaatcgctca ctctgtgcct cacatcaaag ggagaaaatc tgattgttca 
4081 gggggccgga agacagggtc tgtgtcctat ttgtctaagg gtcagagtcc tttggagccc 
4141 ccagagtcct gtggacgtgg ccctaggtag tagggtgagc ttggtaacgg ggctggcttc 
4201 ctgagacaag gctcagaccc gctctgtccc tggggatcgc ttcagccacc aggacctgaa 
4261 aattgtgcac gcctgggccc ccttccaagg catccaggga tgctttccag tggaggcttt 
4321 cagggcagga gaccctctgg cctgcaccct ctcttgccct cagcctccac ctccttgact 
4381 ggacccccat ctggacctcc atccccacca cctctttccc cagtggcctc cctggcagac 
4441 accacagtga ctttctgcag gcacatatct gatcacatca agtccccacc gtgctcccac 
4501 ctcacccatg gtctctcagc cccagcagcc ttggctggcc tctctgatgg agcaggcatc 
4561 aggcacaggc cgtgggtctc aacgtgggct gggtggtcct ggaccagcag cagccgccgc 
4621 agcagcaacc ctggtacctg gttaggaacg cagaccctct gcccccatcc tcccaactct 
4681 gaaaaacact ggcttaggga aaggcgcgat gctcaggggt cccccaaagc ccgcaggcag 
4741 agggagtgat gggactggaa ggaggccgag tgacttggtg agggattcgg gtcccttgca 
4801 tgcagaggct gctgtgggag cggacagtcg cgagagcagc actgcagctg catggggaga 
4861 gggtgttgct ccagggacgt gggatggagg ctgggcgcgg gcgggtggcg ctggagggcg 
4921 ggggaggggc agggagcacc agctcctagc agccaacgac catcgggcgt cgatccctgt 
4981 ttgtctggaa gccctcccct cccctgcccg ctcacccgct gccctgcccc acccgggcgc 
5041 gcccctccgc acaccggctg caggagcctg acgctgcccg ctctctccgc agctggcctt 
5101 ctggtccaag cacgtcggtg agtgcgttct agatccccgg ctggactacc ggcgcccgcg 
5161 cccctcggga tctctggccg ctgaccccct accccgcctt gtgtcgcaga cggtgaccag 
5221 tgcttggtct tgcccttgga gcacccgtgc gccagcctgt gctgcgggca cggcacgtgc 
5281 atcgacggca tcggcagctt cagctgcgac tgccgcagcg gctgggaggg ccgcttctgc 
5341 cagcgcggtg agggggagag gtggatgctg gcgggcggcg gggcggggct ggggccgggt 
5401 tgggggcgcg gcaccagcac cagctgcccg cgccctcccc tgcccgcaga ggtgagcttc 
5461 ctcaattgct ctctggacaa cggcggctgc acgcattact gcctagagga ggtgggctgg 
5521 cggcgctgta gctgtgcgcc tggctacaag ctgggggacg acctcctgca gtgtcacccc 
5581 gcaggtgaga agcccccaat acatcgccca ggaatcacgc tgggtgcggg gtgggcaggc 
5641 ccctgacggg cgcggcgcgg ggggctcagg agggtttcta gggagggagc gaggaacaga 
5701 gttgagcctt ggggcagcgg cagacgcgcc caacaccggg gccactgtta gcgcaatcag 
5761 cccgggagct gggcgcgccc tccgctttcc ctgcttcctt tcttcctggc gtccccgctt 
5821 cctccgggcg cccctgcgac ctggggccac ctcctggagc gcaagcccag tggtggctcc 
5881 gctccccagt ctgagcgtat ctggggcgag gcgtgcagcg tcctcctcca tgtagcctgg 
5941 ctgcgttttt ctctgacgtt gtccggcgtg catcgcattt ccctctttac ccccttgctt 
6001 ccttgaggag agaacagaat cccgattctg ccttcttcta tattttcctt tttatgcatt 
6061 ttaatcaaat ttatatatgt atgaaacttt aaaaatcaga gttttacaac tcttacactt 
6121 tcagcatgct gttccttggc atgggtcctt ttttcattca ttttcataaa aggtggaccc 
6181 ttttaatgtg gaaattccta tcttctgcct ctagggcatt tatcacttat ttcttctaca 
6241 atctcccctt tacttcctct attttctctt tctggacctc ccattattca gacctctttc 
6301 ctctagtttt attgtctctt ctatttccca tctctttgac tttgtgtttt ctttcaggga 
6361 actttctttt ttttcttttt ttttgagatg gagtttcact cttgttgtcc caggctggag 
6421 tgcaacgacg tgatctcagc tcaccacaac ctccgcctcc tggattcaag cgattctcct 
6481 gccgcagcct cccgagtagc tgggattaca ggcatgcgcc accacgccca gctaattttg 
6541 tgtttttagt agagaagggg tttctccgtg ttggtcaagc tggtcttgaa ctcctgacct 
6601 caggtgatcc acctgccttg gcctcctaaa gtgctgggat tacaggcgtg agccaccgcg 
6661 cccagcctct ttcagggaac tttctacaac tttataattc aattcttctg cagaaaaaaa 
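
The CDS location in the HUMPRCA record above is a join of eight exon ranges. As a rough consistency check, the spliced coding length and the implied intron sizes can be derived from those coordinates alone. The sketch below is illustrative only: the helper name `parse_join` is hypothetical, and the coordinate string is the CDS location copied from the listing with the OCR spacing removed.

```python
import re

# CDS location from the HUMPRCA (M11228) feature table, OCR spacing removed.
CDS_LOCATION = (
    "join(2131..2200,3464..3630,5093..5117,5210..5347,"
    "5450..5584,8253..8395,9269..9386,10516..11105)"
)


def parse_join(location):
    """Return the (start, end) pairs of a GenBank join() location.

    Hypothetical helper for illustration; coordinates are 1-based, inclusive.
    """
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


exons = parse_join(CDS_LOCATION)

# Inclusive ranges: length = end - start + 1.
exon_lengths = [end - start + 1 for start, end in exons]
cds_length = sum(exon_lengths)

# Each intron is the gap between one exon's end and the next exon's start.
intron_lengths = [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]

print("exon lengths:  ", exon_lengths)
print("CDS length:    ", cds_length, "bases =", cds_length // 3, "codons")
print("intron lengths:", intron_lengths)
```

The intron sizes computed this way agree with the intron ranges listed in the feature table (for example, 2201..3463 spans 1263 bases), and the spliced length is a whole number of codons, including the stop codon.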
LOCUS       HUMLCAT      1744 bp    mRNA            PRI       07-JAN-1995
DEFINITION  Human lecithin-cholesterol acyltransferase mRNA, complete cds, with
            5' and 3' flanking DNA sequences.
ACCESSION   M12625
NID         g187022
KEYWORDS    lecithin cholesterol acyltransferase.
SOURCE      adult liver (library of A.Ullrich and L.Coussens), cDNA to
            mRNA, clones PL[2,4,10,12,19], and DNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1744)
  AUTHORS   McLean,J., Fielding,C., Drayna,D., Dieplinger,H., Baer,B., Kohr,W.,
            Henzel,W. and Lawn,R.
  TITLE     Cloning and expression of human lecithin-cholesterol
            acyltransferase cDNA
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 83 (8), 2335-2339 (1986)
  MEDLINE
COMMENT     Draft entry and sequence in computer readable form for [1] kindly
            provided by J.W.McLean, 24-JUL-1986.
            Because only the 5' and 3' flanking sequences were determined from
            DNA, it is not known whether this gene contains introns.
FEATURES             Location/Qualifiers
     source          1..1744
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="16q22.1"
     mRNA            <257..1610
                     /note="LCAT mRNA"
     sig_peptide     268..339
                     /gene="LCAT"
                     /note="lecithin-cholesterol acyltransferase signal
                     peptide"
     gene            268..1590
                     /gene="LCAT"
     CDS             268..1590
                     /gene="LCAT"
                     /note="lecithin-cholesterol acyltransferase precursor (EC
                     2.3.1.43)"
                     /codon_start=1
                     /db_xref="GDB:G00-119-359"
                     /db_xref="PID:g307117"
                     /translation="MGPPGSPWQWVTLLLGLLLPPAAPFWLLNVLFPPHTTPKAELSN

HTRPVILVPGCIX^QLEAKLDKPDWNWMCYRKTEDFFTIV^ 

TRWYNRSSGLVSNAPGVQIRVPGFGKTYSVEYL^ 

ETVRAAPYDWRLEPGQQEEYYRKIAGLVEEMHAAYGKFVFLIGHSLK 

PQAWKDRFIDGFI SLGAPWGGSIKPMLVLASGDNQGIPIMSS IKLKEEQRITTTSPWM 

FPSRMAWPEDHVFISTPSFNYTGRDFQRFFADLHFEEGWYMWLQSRDLLAGLPAPGVE 

VYCLYGVGLPTPRTYIYDHGFPYTDPVGVLYEDGDDTVATRSTELCGLWQGRQPQPVH 

LLPLHGIQHLNMVFSNLTLEHINAILLGAYRQGPPASPTASPEPPPPE" 
     mat_peptide     340..1587
                     /gene="LCAT"
                     /note="lecithin-cholesterol acyltransferase"
BASE COUNT 324 a 589 c 475 g 356 t 

ORIGIN 30 bp upstream of StyI recognition sequence. 

1 tgaggcctga ctttttcaat aaaacattgt gtagttctgg gcctcctgct gccccggctc 
61 tgtttcccct ggcgccaaga gaagaaggcg gaactgaacc caggcccaga gccggctccc 
121 tgaggctgtg cccctttccg gcaatctctg gccacaaccc ccactggcca ggccgtccct 
181 cccactggcc ctagggcccc tcccactccc acaccagata aggacagccc agtgccgctt 
241 tctctggcag taggcaccag ggctggaatg gggccgcccg gctccccatg gcagtgggtg 
301 acgctgctgc tggggctgct gctccctcct gccgccccct tctggctcct caatgtgctc 
361 ttccccccgc acaccacgcc caaggctgag ctcagtaacc acacacggcc cgtcatcctc 
421 gtgcccggct gcctggggaa tcagctagaa gccaagctgg acaaaccaga tgtggtgaac 
481 tggatgtgct accgcaagac agaggacttc ttcaccatct ggctggatct caacatgttc 
541 ctaccccttg gggtagactg ctggatcgat aacaccaggg ttgtctacaa ccggagctct 
601 gggctcgtgt ccaacgcccc tggtgtccag atccgcgtcc ctggctttgg caagacctac 
661 tctgtggagt acctggacag cagcaagctg gcagggtacc tgcacacact ggtgcagaac 
721 ctggtcaaca atggctacgt gcgggacgag actgtgcgcg ccgcccccta tgactggcgg 
781 ctggagcccg gccagcagga ggagtactac cgcaagctcg cagggctggt ggaggagatg 
841 cacgctgcct atgggaagcc tgtcttcctc attggccaca gcctcggctg tctacacttg 
901 ctctatttcc tgctgcgcca gccccaggcc tggaaggacc gctttattga tggcttcatc 
961 tctcttgggg ctccctgggg tggctccatc aagcccatgc tggtcttggc ctcaggtgac 
1021 aaccagggca tccccatcat gtccagcatc aagctgaaag aggagcagcg cataaccacc 
1081 acctccccct ggatgtttcc ctctcgcatg gcgtggcctg aggaccacgt gttcatttcc 
1141 acacccagct tcaactacac aggccgtgac ttccaacgct tctttgcaga cctgcacttt 
1201 gaggaaggct ggtacatgtg gctgcagtca cgtgacctcc tggcaggact cccagcacct 
1261 ggtgtggaag tatactgtct ttacggcgtg ggcctgccca cgccccgcac ctacatctac 
1321 gaccacggct tcccctacac ggaccctgtg ggtgtgctct atgaggatgg tgatgacacg 
1381 gtggcgaccc gcagcaccga gctctgtggc ctgtggcagg gccgccagcc acagcctgtg 
1441 cacctgctgc ccctgcacgg gatacagcat ctcaacatgg tcttcagcaa cctgaccctg 
1501 gagcacatca atgccatcct gctgggtgcc taccgccagg gtccccctgc atccccgact 
1561 gccagcccag agcccccgcc tcctgaataa agaccttcct ttgctaccgt aagccctgat 
1621 ggctatgttt caggttgaag ggaggcacta gagtcccaca ctaggtttca ctcctcacca 
1681 gccacaggct cagtgctgtg tgcagtgagg caagatgggc tctgctgagg cctgggactg 
1741 agct 
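
The coordinates and base counts of the HUMLCAT record above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original listing: the helper names `span_length` and `codon_count` are hypothetical, and the numbers are copied from the record's feature table and BASE COUNT line.

```python
# Values copied from the HUMLCAT (M12625) listing.
RECORD_LENGTH = 1744
BASE_COUNT = {"a": 324, "c": 589, "g": 475, "t": 356}
FEATURES = {
    "sig_peptide": (268, 339),
    "CDS": (268, 1590),
    "mat_peptide": (340, 1587),
}


def span_length(start, end):
    """Length in bases of an inclusive, 1-based range start..end."""
    return end - start + 1


def codon_count(start, end):
    """Number of whole codons in a range; rejects ranges that are not in frame."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"{start}..{end} is not a multiple of three bases")
    return length // 3


codons = {name: codon_count(*loc) for name, loc in FEATURES.items()}
total_bases = sum(BASE_COUNT.values())

print("codons per feature:", codons)
print("sum of base counts:", total_bases, "(record length", RECORD_LENGTH, ")")
```

Under these inputs the signal peptide and mature peptide together account for every CDS codon except the final stop codon, and the four base counts add up to the stated record length.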
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LOCUS HUMHCII 2182 bp mRNA PRI 08-NOV-1994 

DEFINITION  Human heparin cofactor II (HC-II) mRNA, complete cds.
ACCESSION   M12849 M19241
NID         g183909
KEYWORDS    heparin cofactor II; protease inhibitor.
SOURCE      Human fetal liver, cDNA to mRNA, clone lambda-HCII.7 [1]; adult
            liver, cDNA to mRNA, clone lambda HCII.7.1 [3].
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1025 to 2182)
  AUTHORS   Inhorn,R.C. and Tollefsen,D.M.
  JOURNAL   Unpublished (1986)
REFERENCE   2  (bases 1025 to 2182)
  AUTHORS   Inhorn,R.C. and Tollefsen,D.M.
  TITLE     Isolation and characterization of a partial cDNA clone for heparin
            cofactor II
  JOURNAL   Biochem. Biophys. Res. Commun. 137 (1), 431-436 (1986)
  MEDLINE   86242236
REFERENCE   3  (bases 1 to 2182)
  AUTHORS   Blinder,M.A., Marasa,J.C., Reynolds,C.H., Deaven,L.L. and
            Tollefsen,D.M.
  TITLE     Heparin cofactor II: cDNA sequence, chromosome localization,
            restriction fragment length polymorphism, and expression in
            Escherichia coli
  JOURNAL   Biochemistry 27 (2), 752-759 (1988)
  MEDLINE   88163663
COMMENT     [1] revises [2].
            Draft entry and computer-readable sequence of [2] kindly provided
            by D.M.Tollefsen, 18-AUG-1986.
            Draft entry and computer-readable sequence of [3] kindly provided
            by Blinder,M.A. 24-MAR-1988.
FEATURES             Location/Qualifiers
     source          1..2182
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="22q11.2"
     mRNA            <1..2182
                     /note="heparin cofactor II mRNA"
     sig_peptide     29..85
                     /gene="HCF2"
                     /note="heparin cofactor II signal protein"
     gene            29..1528
                     /gene="HCF2"
     CDS             29..1528
                     /gene="HCF2"
                     /note="heparin cofactor II precursor"
                     /codon_start=1
                     /db_xref="GDB:G00-120-038"
                     /db_xref="PID:g183910"
                     /translation="MKHSLNALLIFLIITSAWGGSKGPLDQLEKGGETAQSADPQWEQ

LNNKNLSMPLLPADFHKENTVTNDWI PEGEEDDDYLDLEKI F SEDDDYIDI VDSLSVS 

PTDSDVSAGNI LQLFHGKSRIQRI^II^JAKFAFNLYRVLKDQVNTFDNI FI APVGI ST 

AMGMISLGLKGETHEQVHSILHFKDFVNASSKYEITTiro 

RSVNDLYI QKQF PI LLDFRTKVREYYFAEAQIADFSDPAF I SKTNNHIMKLTKGLI KD 
AI^IDPATQMMII^CIYFKGSWVmFPVEMTHNHNFRI^ 

ANDQELDCDILQLEWGGISMLIWPHKMSGMKTLEAQLTPRVVERWQKSMTNRTREV 
LLPKFKLEKNYNLVESLKLMGIRMLFDKNGNMAGISDQRIAIDLFKHQGTITVNEEGT 
QATTVTTVGFMPLSTQVRFTVDRPFLFLIYEHRTSCLLFMGRVANPSRS" 
     mat_peptide     86..1525
                     /gene="HCF2"
                     /note="heparin cofactor II"
BASE COUNT 603 a 581 c 500 g 498 t 

ORIGIN 142 bp upstream from PstI site; chromosome 22. 

1 cgaaacacag agctttagct ccgccaaaat gaaacactca ttaaacgcac ttctcatttt 

61 cctcatcata acatctgcgt ggggtgggag caaaggcccg ctggatcagc tagagaaagg 

121 aggggaaact gctcagtctg cagatcccca gtgggagcag ttaaataaca aaaacctgag 

181 catgcctctt ctccctgccg acttccacaa ggaaaacacc gtcaccaacg actggattcc 

241 agagggggag gaggacgacg actatctgga cctggagaag atattcagtg aagacgacga 

301 ctacatcgac atcgtcgaca gtctgtcagt ttccccgaca gactctgatg tgagtgctgg 

361 gaacatcctc cagctttttc atggcaagag ccggatccag cgtcttaaca tcctcaacgc 

421 caagttcgct ttcaacctct accgagtgct gaaagaccag gtcaacactt tcgataacat 

481 cttcatagca cccgttggca tttctactgc gatgggtatg atttccttag gtctgaaggg 

541 agagacccat gaacaagtgc actcgatttt gcattttaaa gactttgtta atgccagcag 

601 caagtatgaa atcacgacca ttcataatct cttccgtaag ctgactcatc gcctcttcag 

661 gaggaatttt gggtacacac tgcggtcagt caatgacctt tatatccaga agcagtttcc 

721 aatcctgctt gacttcagaa ctaaagtaag agagtattac tttgctgagg cccagatagc 

781 tgacttctca gaccctgcct tcatatcaaa aaccaacaac cacatcatga agctcaccaa 

841 gggcctcata aaagatgctc tggagaatat agaccctgct acccagatga tgattctcaa 

901 ctgcatctac ttcaaaggat cctgggtgaa taaattccca gtggaaatga cacacaacca 

961 caacttccgg ctgaatgaga gagaggtagt taaggtttcc atgatgcaga ccaaggggaa 

1021 cttcctcgca gcaaatgacc aggagctgga ctgcgacatc ctccagctgg aatacgtggg 

1081 gggcatcagc atgctaattg tggtcccaca caagatgtct gggatgaaga ccctcgaagc 

1141 gcaactgaca ccccgggtgg tggagagatg gcaaaaaagc atgacaaaca gaactcgaga 

1201 agtgcttctg ccgaaattca agctggagaa gaactacaat ctagtggagt ccctgaagtt 

1261 gatggggatc aggatgctgt ttgacaaaaa tggcaacatg gcaggcatct cagaccaaag 

1321 gatcgccatc gacctgttca agcaccaagg cacgatcaca gtgaacgagg aaggcaccca 

1381 agccaccact gtgaccacgg tggggttcat gccgctgtcc acccaagtcc gcttcactgt 

1441 cgaccgcccc tttcttttcc tcatctacga gcaccgcacc agctgcctgc tcttcatggg 

1501 aagagtggcc aaccccagca ggtcctagag gtggaggtct aggtgtctga agtgccttgg 

1561 gggcaccctc attttgtttc cattccaaca acgagaacag agatgttctg gcatcattta 

1621 cgtagtttac gctaccaatc tgaattcgag gcccatatga gaggagctta gaaacgacca 

1681 agaagagagg cttgttggaa tcaattctgc acaatagccc atgctgtaag ctcatagaag 

1741 tcactgtaac tgtagtgtgt ctgctgttac ctagagggtc tcacctcccc actcttcaca 

1801 gcaaacctga gcagcgcgtc ctaagcacct cccgctccgg tgaccccatc cttgcacacc 

1861 tgactctgtc actcaagcct ttctccacca ggcccctcat ctgaatacca agcacagaaa 

1921 tgagtggtgt gactaattcc ttacctctcc caaggagggt acacaactag caccattctt 

1981 gatgtccagg gaagaagcca cctcaagaca tatgaggggt gccctgggct aatgttaggg 

2041 cttaattttc tcaaagcctg acctttcaaa tccatgatga atgccatcag tccctcctgc 

2101 tgttgcctcc ctgtgacctg gaggacagtg tgtgccatgt ctcccatact agagataaat 
2161 aaatgtagcc acatttactg tg 
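
The `/translation` qualifier of the HUMHCII record above can be spot-checked against the nucleotide listing. The sketch below is an illustration under stated assumptions: it uses the standard genetic code, the helper name `translate_from` is hypothetical, and the 60 bases are the first sequence line of the record as printed above (positions 1 to 60), with the CDS start at position 29 taken from the feature table.

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# First sequence line of the HUMHCII listing (bases 1..60), copied as printed.
FIRST_LINE = "cgaaacacag agctttagct ccgccaaaat gaaacactca ttaaacgcac ttctcatttt"
CDS_START = 29  # 1-based start of the CDS feature (29..1528)


def translate_from(sequence, start):
    """Translate whole codons beginning at a 1-based start position.

    Hypothetical helper: strips the spaces of the printed layout and ignores
    any trailing bases that do not complete a codon.
    """
    dna = sequence.replace(" ", "").lower()
    coding = dna[start - 1:]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


peptide = translate_from(FIRST_LINE, CDS_START)
print(peptide)
```

The ten residues obtained from this first line match the opening of the translation given in the record (MKHSLNALLI...); extending the check to the full CDS would require concatenating all sequence lines of the record.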
LOCUS       HUMFVA       6893 bp    mRNA            PRI       08-AUG-1995
DEFINITION  Human coagulation factor V mRNA, complete cds.
ACCESSION   M14335 M17785
NID         g182797
KEYWORDS    coagulation factor V; factor V; glycoprotein.
SOURCE      Human liver (normal hepatocyte and HepG-2 cells), cDNA to mRNA,
            clones HV3.37, HV0.85, HV1.66 and HV2.97.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 3636 to 6893)
  AUTHORS   Kane,W.H. and Davie,E.W.
  TITLE     Cloning of a cDNA coding for human factor V, a blood coagulation
            factor homologous to factor VIII and ceruloplasmin
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 83 (18), 6800-6804 (1986)
  MEDLINE   86313665
REFERENCE   2  (bases 1 to 4876)
  AUTHORS   Kane,W.H., Ichinose,A., Hagen,F.S. and Davie,E.W.
  TITLE     Cloning of cDNAs coding for the heavy chain region and connecting
            region of human factor V, a blood coagulation factor with four
            types of internal repeats
  JOURNAL   Biochemistry 26 (20), 6508-6514 (1987)
  MEDLINE   88107560
COMMENT     Draft entry and computer-readable sequence [1] kindly submitted by
            W.H.Kane, 13-JUN-1988.
FEATURES             Location/Qualifiers
     source          1..6893
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="1q21-q25"
     gene            77..6751
                     /gene="F5"
     sig_peptide     77..160
                     /gene="F5"
                     /note="factor V signal peptide"
     CDS             77..6751
                     /gene="F5"
                     /note="factor V precursor"
                     /codon_start=1
                     /db_xref="GDB:G00-119-896"
                     /db_xref="PID:g182798"
                     /translation="MFPGCPRLWVLVVLGTSWVGWGSQGTEAAQLRQFYVAAQGISWS

YRPEPTNSSI^SVTSFKKIWREYEPYFKKEKPQSTISGLLGPTLYAEVGDIIKVHF 

KNKADKPLSIHPQGIRYSKLSEGASYLDHTFPAEKMDDAVAPGREYTYEWSISEDSGP 

THDDPPCLTHIYYSHENLIEDFNSGLIGPLLICKKGTLTEGGTQKTFDKQIVLLFAVF 

DESKSWSQSSSLMYTraGYTOGTMPDITVCAHDHISWHLLGMSSGPELFSIHFNGQV^ 

EQNHHKVSAITLVSATSTTAl^MTVGPEGKWI I S SLTPKHLQAGMQAYIDI KNC PKKTR 

NLKKITREQRRHMKRWEYFIAAEEVIWDYAPVIPANMDKKYRSQHIJDNFSNQIGKHYK 

KVMYTQYEDESFTKHT\n^NMKEDGILGPIIRAQVRDTLKr^KNMASRPYSIYPHGV 

TFS PYEDEVNSSFTSGRNNTMI RAVQPGETYTYKWNILEFDEPTENDAQCLTRPYYSD 

VDIMRDIASGLIGLLLICKSRSLDRRGIQRAADIEQQAVFAVFDENKSWYLEDNINKF 

CENPDEVKRDDPKFYESNIMSTINGYVPESITTLGFCFDDTVQWHFCSVGTQNEILTI 
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HFTGHSFIYGKRHEDTLTLFPMRGESVTVTTOOTGTWMLTSMNSSPRSKKLRLKFRW 

KCIPDDDEDSYEIFEPPESTVMATRKMHDRLEPEDEESDADYDYQNRLAAAIjGIRSFR 

NSSLNQEEEEFNLTALALENGTEFVSSNTDIIVGSNYSSPSNISKFTVNNLAEPQKAP 

SHQQATTAGSPLRHLIGKNSVLNSSTAEHSSPYSEDPIEDPLQPDVTGIRLLSLGAGE 

FRSQEHAKRKGPKVEREDQAAKHRFSViMKLLAHKVGRHLSQDTGSPSGMRPWEDLPSQD 

TGSPSRMRPWEDPPSDLLLLKQSNSSKILVGRWHLASEKGSYEIIQDTDEDTAVNNWL 

ISPQNASRAWGESTPLANKPGKQSGHPKFPRVRHKSLQVRQDGGKSRLKKSQFLIKTR 

KKKKEKHTHHAPLSPRTFHPLRSEAYNTFSERRLKHSLVLHKSNETSLPTDLNQTLPS 

MDFGWIASLPDHNQNSSNDTGQASCPPGLYQTVPPEEHYQTFPIQDPDQMHSTSDPSH 

RSSSPELSEMLEYDRSHKSFPTDISQMSPSSEHEVWQTVISPDLSQVTLSPELSQTNL. 

SPDLSHTTLSPELIQRNLSPALGQMPISPDLSHTTLSPDLSHTTLSLDLSQTNLSPEL 

SQTNLSPALGQMPLSPDLSHTTISLDFSQTNLSPELSHMTLSPELSQTNLSP/UjGQMP 

ISPDLSHTTLSLDFSQTNLSPELSQTNLSPALGQMPLSPDPSHTTLSLDLSQTNLSPE 

LSQTNLSPDLSEMPLFADLSQIPLTPDLDQMTLSPDLGETDLSPNFGQMSLSPDLSQV 

TLSPDISDTTLLPDLSQISPPPDLDQIFYPSESSQSLLLQEFNESFPYPDLGQMPSPS 

SPTLNOTFLSKEFNPLVIVGLSKDGTDYIEIIPKEEVQSSEDDYAEIDYVPYDDPYKT 

DVRTNINSSRDPDNIAAWYLRSNNGNRRNYYIAAEEISWDYSEFVQRETDIEDSDDIP 

EOTTYKKVVFRKYLDSTFTKRDPRGEYEEHUSIUSPIIRAEVDDVIQVRFKNI^ 

SLHAHGLSYEKSSEGKTYEDDSPEWFKEDNAVQPNSSYTYVWHATERSGPESPGSACR 

AWAYYSAVNPEKDIHSGLIGPLLICQKGILHKDSNMPVDMREFVLLFMTFDEKKSWYY 

EKKSRSSWRLTSSEMKKSHEFHAINGMIYSLPGLKMY^QEWVRLHLLNIGGSQDIHVV 

HFHGQTLLENGNKQHQLGWPLLPGSFKTLEMKASKPGWWLLNTEVGENQRAGMQTPF 

LIMDRIXTRMPMGLSTGIISDSQIKASEFLGYWEPRL^^ 

SKPWIQVDMQKEVIITGIQTQGAKHYLKSCYTTEFYVAYSSNQINWQIFKGNSTRNVM 
YFNGNSDASTIKENQFDPPIVARYIRISPTRAYNRPTLRLELQGCEVNGCSTPLGMEN 
GKIENKQITASSFKKSWWGDYWEPFRARIJ^QGRVN^^ 

ITAIITQGCKSLSSEMYVKSYTIHYSEQGVEWKPYTILKSSMVDKIFEGNTNTKGHVKN 
FFNPPIISRFIRVIPKTWNQSIALRLELFGCDIY" 
     mat_peptide     161..6748
                     /gene="F5"
                     /note="factor V"
     variation       3723..4024
                     /gene="F5"
                     /note="ccctt in clone HV2.97 [1]"
                     /replace="ccctt"
BASE COUNT 2090 a 1700 c 1423 g 1680 t 

ORIGIN 270 bp upstream of AccI site; chromosome 1q21-q25. 

1 ctccgggctg tcccagctcg gcaagcgctg cccaggtcct ggggtggtgg cagccagcgg 
61 gagcaggaaa ggaagcatgt tcccaggctg cccacgcctc tgggtcctgg tggtcttggg 
121 caccagctgg gtaggctggg ggagccaagg gacagaagcg gcacagctaa ggcagttcta 
181 cgtggctgct cagggcatca gttggagcta ccgacctgag cccacaaact caagtttgaa 
241 tctttctgta acttccttta agaaaattgt ctacagagag tatgaaccat attttaagaa 
301 agaaaaacca caatctacca tttcaggact tcttgggcct actttatatg ctgaagtcgg 
361 agacatcata aaagttcact ttaaaaataa ggcagataag cccttgagca tccatcctca 
421 aggaattagg tacagtaaat tatcagaagg tgcttcttac cttgaccaca cattccctgc 
481 agagaagatg gacgacgctg tggctccagg ccgagaatac acctatgaat ggagtatcag 
541 tgaggacagt ggacccaccc atgatgaccc tccatgcctc acacacatct attactccca 
601 tgaaaatctg atcgaggatt tcaactctgg gctgattggg cccctgctta tctgtaaaaa 
661 agggacccta actgagggtg ggacacagaa gacgtttgac aagcaaatcg tgctactatt 
721 tgctgtgttt gatgaaagca agagctggag ccagtcatca tccctaatgt acacagtcaa 
781 tggatatgtg aatgggacaa tgccagatat aacagtttgt gcccatgacc acatcagctg 
841 gcatctgctg ggaatgagct cggggccaga attattctcc attcatttca acggccaggt 
901 cctggagcag aaccatcata aggtctcagc catcaccctt gtcagtgcta catccactac 
961 cgcaaatatg actgtgggcc cagagggaaa gtggatcata tcttctctca ccccaaaaca 
1021 tttgcaagct gggatgcagg cttacattga cattaaaaac tgcccaaaga aaaccaggaa 
1081 tcttaagaaa ataactcgtg agcagaggcg gcacatgaag aggtgggaat acttcattgc 
1141 tgcagaggaa gtcatttggg actatgcacc tgtaatacca gcgaatatgg acaaaaaata 
1201 caggtctcag catttggata atttctcaaa ccaaattgga aaacattata agaaagttat 
1261 gtacacacag tacgaagatg agtccttcac caaacataca gtgaatccca atatgaaaga 
1321 agatgggatt ttgggtccta ttatcagagc ccaggtcaga gacacactca aaatcgtgtt 
1381 caaaaatatg gccagccgcc cctatagcat ttaccctcat ggagtgacct tctcgcctta 
1441 tgaagatgaa gtcaactctt ctttcacctc aggcaggaac aacaccatga tcagagcagt 
1501 tcaaccaggg gaaacctata cttataagtg gaacatctta gagtttgatg aacccacaga 
1561 aaatgatgcc cagtgcttaa caagaccata ctacagtgac gtggacatca tgagagacat 
1621 cgcctctggg ctaataggac tacttctaat ctgtaagagc agatccctgg acaggcgagg 
1681 aatacagagg gcagcagaca tcgaacagca ggctgtgttt gctgtgtttg atgagaacaa 
1741 aagctggtac cttgaggaca acatcaacaa gttttgtgaa aatcctgatg aggtgaaacg 
1801 tgatgacccc aagttttatg aatcaaacat catgagcact atcaatggct atgtgcctga 
1861 gagcataact actcttggat tctgctttga tgacactgtc cagtggcact tctgtagtgt 
1921 ggggacccag aatgaaattt tgaccatcca cttcactggg cactcattca tctatggaaa 
1981 gaggcatgag gacaccttga ccctcttccc catgcgtgga gaatctgtga cggtcacaat 
2041 ggataatgtt ggaacttgga tgttaacttc catgaattct agtccaagaa gcaaaaagct 
2101 gaggctgaaa ttcagggatg ttaaatgtat cccagatgat gatgaagact catatgagat 
2161 ttttgaacct ccagaatcta cagtcatggc tacacggaaa atgcatgatc gtttagaacc 
2221 tgaagatgaa gagagtgatg ctgactatga ttaccagaac agactggctg cagcattagg 
2281 aattaggtca ttccgaaact catcattgaa ccaggaagaa gaagagttca atcttactgc 
2341 cctagctctg gagaatggca ctgaattcgt ttcttcgaac acagatataa ttgttggttc 
2401 aaattattct tccccaagta atattagtaa gttcactgtc aataaccttg cagaacctca 
2461 gaaagcccct tctcaccaac aagccaccac agctggttcc ccactgagac acctcattgg 
2521 caagaactca gttctcaatt cttccacagc agagcattcc agcccatatt ctgaagaccc 
2581 tatagaggat cctctacagc cagatgtcac agggatacgt ctactttcac ttggtgctgg 
2641 agaattcaga agtcaagaac atgctaagcg taagggaccc aaggtagaaa gagatcaagc 
2701 agcaaagcac aggttctcct ggatgaaatt actagcacat aaagttggga gacacctaag 
2761 ccaagacact ggttctcctt ccggaatgag gccctgggag gaccttccta gccaagacac 
2821 tggttctcct tccagaatga ggccctggga ggaccctcct agtgatctct tactcttaaa 
2881 acaaagtaac tcatctaaga ttttggttgg gagatggcat ttggcttctg agaaaggtag 
2941 ctatgaaata atccaagata ctgatgaaga cacagctgtt aacaattggc tgatcagccc 
3001 ccagaatgcc tcacgtgctt ggggagaaag cacccctctt gccaacaagc ctggaaagca 
3061 gagtggccac ccaaagtttc ctagagttag acataaatct ctacaagtaa gacaggatgg 
3121 aggaaagagt agactgaaga aaagccagtt tctcattaag acacgaaaaa agaaaaaaga 
3181 gaagcacaca caccatgctc ctttatctcc gaggaccttt caccctctaa gaagtgaagc 
3241 ctacaacaca ttttcagaaa gaagacttaa gcattcgttg gtgcttcata aatccaatga 
3301 aacatctctt cccacagacc tcaatcagac attgccctct atggattttg gctggatagc 
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3361 ctcacttcct gaccataatc agaattcctc 
3421 aggtctttat cagacagtgc ccccagagga 
3481 tgatcaaatg cactctactt cagaccccag 
3541 aatgcttgag tatgaccgaa gtcacaagtc 
3601 ttcctcagaa catgaagtct ggcagacagt 
3661 ctctccagaa ctcagccaga caaacctctc 
3721 agaactcatt cagagaaacc tttccccagc 
3781 cagccataca accctttctc cagacctcag 
3841 gacaaacctc tctccagaac tcagtcagac 
3901 cctttctcca gacctcagcc atacaaccat 
3961 tccagaactc agccatatga ctctctctcc 
4021 cctcggtcag atgcccattt ctccagacct 
4081 ccagacaaac ctctctccag aactcagtca 
4141 gcccctttct ccagacccca gccatacaac 
4201 ctctccagaa ctcagtcaga caaacctttc 
4261 agatctcagt caaattcccc ttaccccaga 
4321 tggtgagaca gatctttccc caaactttgg 
4381 ggtgactctc tctccagaca tcagtgacac 
4441 acctcctcca gaccttgatc agatattcta 
4501 tcaagaattt aatgagtctt ttccttatcc 
4561 tcctactctc aatgatactt ttctatcaaa 
4621 cagtaaagat ggtacagatt acattgagat 
4681 agatgactat gctgaaattg attatgtgcc 
4741 gacaaacatc aactcctcca gagatcctga 
4801 caatggaaac agaagaaatt attacattgc 
4861 atttgtacaa agggaaacag atattgaaga 
4921 taagaaagta gtttttcgaa agtacctcga 
4981 ggagtatgaa gagcatctcg gaattcttgg 
5041 tatccaagtt cgttttaaaa atttagcatc 
5101 ttcctatgaa aaatcatcag agggaaagac 
5161 ggaagataat gctgttcagc caaatagcag 
5221 atcagggcca gaaagtcctg gctctgcctg 
5281 cccagaaaaa gatattcact caggcttgat 
5341 actacataag gacagcaaca tgcctgtgga 
5401 ctttgatgaa aagaagagct ggtactatga 
5461 atcctcagaa atgaaaaaat cccatgagtt 
5521 gcctggcctg aaaatgtatg agcaagagtg 
5581 ctcccaagac attcacgtgg ttcactttca 
5641 acagcaccag ttaggggtct ggccccttct 
5701 ggcatcaaaa cctggctggt ggctcctaaa 
5761 gatgcaaacg ccatttctta tcatggacag 
5821 tggtatcata tctgattcac agatcaaggc 
5881 attagcaaga ttaaacaatg gtggatctta 
5941 agaatttgcc tctaaacctt ggatccaggt 
6001 gatccagacc caaggtgcca aacactacct 
6061 agcttacagt tccaaccaga tcaactggca 
6121 gatgtatttt aatggcaatt cagatgcctc 
6181 tattgtggct agatatatta ggatctctcc 
6241 attggaactg caaggttgtg aggtaaatgg 
6301 aaagatagaa aacaagcaaa tcacagcttc 
6361 ctgggaaccc ttccgtgccc gtctgaatgc 
6421 ggcaaacaac aataagcagt ggctagaaat 
6481 aattataaca cagggctgca agtctctgtc 
6541 ccactacagt gagcagggag tggaatggaa 
6601 caagattttt gaaggaaata ctaataccaa 
6661 aatcatttcc aggtttatcc gtgtcattcc 
6721 cctggaactc tttggctgtg atatttacta 
6781 actctttaag acctcaaacc atttagaatg 
6841 aacagttttc cactatttct ctttcttttc 



aaatgacact ggtcaggcaa gctgtcctcc 
acactatcaa acattcccca ttcaagaccc 
tcacagatcc tcttctccag agctcagtga 
cttccccaca gatataagtc aaatgtcccc 
catctctcca gacctcagcc aggtgaccct 
tccagacctc agccacacga ctctctctcc 
cctcggtcag atgcccattt ctccagacct 
ccatacaacc ctttctttag acctcagcca 
aaacctttct ccagccctcg gtcagatgcc 
ttctctagac ttcagccaga caaacctctc 
agaactcagt cagacaaacc tttccccagc 
cagccataca accctttctc tagacttcag 
aacaaacctt tccccagccc tcggtcagat 
cctttctcta gacctcagcc agacaaacct 
cccagacctc agtgagatgc ccctctttgc 
cctcgaccag atgacacttt ctccagacct 
tcagatgtcc ctttccccag acctcagcca 
cacccttctc ccggatctca gccagatatc 
cccttctgaa tctagtcagt cattgcttct 
agaccttggt cagatgccat ctccttcatc 
ggaatttaat ccactggtta tagtgggcct 
cattccaaag gaagaggtcc agagcagtga 
ctatgatgac ccctacaaaa ctgatgttag 
caacattgca gcatggtacc tccgcagcaa 
tgctgaagaa atatcctggg attattcaga 
ctctgatgat attccagaag ataccacata 
cagcactttt accaaacgtg atcctcgagg 
tcctattatc agagctgaag tggatgatgt 
cagaccgtat tctctacatg cccatggact 
ttatgaagat gactctcctg aatggtttaa 
ttatacctac gtatggcatg ccactgagcg 
tcgggcttgg gcctactact cagctgtgaa 
aggtcccctc ctaatctgcc aaaaaggaat 
catgagagaa tttgtcttac tatttatgac 
aaagaagtcc cgaagttctt ggagactcac 
tcacgccatt aatgggatga tctacagctt 
ggtgaggtta cacctgctga acataggcgg 
cggccagacc ttgctggaaa atggcaataa 
gcctggttca tttaaaactc ttgaaatgaa 
cacagaggtt ggagaaaacc agagagcagg 
agactgtagg atgccaatgg gactaagcac 
ttcagagttt ctgggttact gggagcccag 
taatgcttgg agtgtagaaa aacttgcagc 
ggacatgcaa aaggaagtca taatcacagg 
gaagtcctgc tataccacag agttctatgt 
gatcttcaaa gggaacagca caaggaatgt 
tacaataaaa gagaatcagt ttgacccacc 
aactcgagcc tataacagac ctacccttcg 
atgttccaca cccctgggta tggaaaatgg 
ttcgtttaag aaatcttggt ggggagatta 
ccagggacgt gtgaatgcct ggcaagccaa 
tgatctactc aagatcaaga agataacggc 
ctctgaaatg tatgtaaaga gctataccat 
accatacagg ctgaaatcct ccatggtgga 
aggacatgtg aagaactttt tcaacccccc 
taaaacatgg aatcaaagta ttgcacttcg 
gaattgaaca ttcaaaaacc cctggaagag 
ggcaatgtat tttacgctgt gttaaatgtt 
tattagtgaa taaaatttta tac
LOCUS HUMLPL 3549 bp mRNA PRI 08-AUG-1995
DEFINITION Human lipoprotein lipase mRNA, complete cds.
ACCESSION M15856
NID g187209
KEYWORDS lipoprotein lipase.
SOURCE Human adipose tissue, cDNA to mRNA, clones LPL[35,37,46].
ORGANISM Homo sapiens
Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 3549)
AUTHORS Wion,K.L., Kirchgessner,T.G., Lusis,A.J., Schotz,M.C. and Lawn,R.M.
TITLE Human lipoprotein lipase complementary DNA sequence
JOURNAL Science 235 (4796), 1638-1641 (1987)
MEDLINE 87149101
COMMENT Draft entry and clean copy sequence for [1] kindly provided by
R.Lawn, 18-MAY-1987.
Several mRNAs ended at around position 2416.
FEATURES Location/Qualifiers
source 1..3549
/organism="Homo sapiens"
/db_xref="taxon:9606"
/map="8p22"
mRNA <1..3549
/gene="LPL"
/note="LPL mRNA (alt.); G00-120-700"
mRNA <1..3154
/gene="LPL"
/note="LPL mRNA (alt.); G00-120-700"
gene 1..3549
/gene="LPL"
sig_peptide 175..255
/gene="LPL"
/note="lipoprotein lipase signal peptide; G00-120-700"
CDS 175..1602
/gene="LPL"
/note="lipoprotein lipase precursor"
/codon_start=1
/db_xref="GDB:G00-120-700"
/db_xref="PID:g307138"
/translation="MESKALLVLTLAVWLQSLTASRGGVAAADQRRDFIDIESKFALR
TPEDTAEDTCHLIPGVAESVATCHFNHSSKTFMVIHGWTVTGMYESWVPKLVAALYKR
EPDSNVIVVDWLSRAQEHYPVSAGYTKLVGQDVARFINWMEEEFNYPLDNVHLLGYSL
GAHAAGIAGSLTNKKVNRITGLDPAGPNFEYAEAPSRLSPDDADFVDVLHTFTRGSPG
RSIGIQKPVGHVDIYPNGGTFQPGCNIGEAIRVIAERGLGDVDQLVKCSHERSIHLFI
DSLLNEENPSKAYRCSSKEAFEKGLCLSCRKNRCNNLGYEINKVRAKRSSKMYLKTRS
QMPYKVFHYQVKIHFSGTESETHTNQAFEISLYGTVAESENIPFTLPEVSTNKTYSFL
IYTEVDIGELLMLKLKWKSDSYFSWSDWWSSPGFAIQKIRVKAGETQKKVIFCSREKV
SHLQKGKAPAVFVKCHDKSLNKKSG"
mat_peptide 256..1599
/gene="LPL"
/note="lipoprotein lipase; G00-120-700"
variation 1611
/gene="LPL"
/note="g can be a; G00-120-700"
/replace="a"
variation 2743
/gene="LPL"
/note="t can be c; G00-120-700"
/replace="c"
variation 2851
/gene="LPL"
/note="a can be g; G00-120-700"
/replace="g"

BASE COUNT 1020 a 739 c 806 g 984 t 

ORIGIN Unreported. 

1 cccctcttcc tcctcctcaa gggaaagctg cccacttcta gctgccctgc catccccttt 
61 aaagggcgac ttgctcagcg ccaaaccgcg gctccagccc tctccagcct ccggctcagc 
121 cggctcatca gtcggtccgc gccttgcagc tcctccagag ggacgcgccc cgagatggag 
181 agcaaagccc tgctcgtgct gactctggcc gtgtggctcc agagtctgac cgcctcccgc 
241 ggaggggtgg ccgccgccga ccaaagaaga gattttatcg acatcgaaag taaatttgcc 
301 ctaaggaccc ctgaagacac agctgaggac acttgccacc tcattcccgg agtagcagag 
361 tccgtggcta cctgtcattt caatcacagc agcaaaacct tcatggtgat ccatggctgg 
421 acggtaacag gaatgtatga gagttgggtg ccaaaacttg tggccgccct gtacaagaga 
481 gaaccagact ccaatgtcat tgtggtggac tggctgtcac gggctcagga gcattaccca 
541 gtgtccgcgg gctacaccaa actggtggga caggatgtgg cccggtttat caactggatg 
601 gaggaggagt ttaactaccc tctggacaat gtccatctct tgggatacag ccttggagcc 
661 catgctgctg gcattgcagg aagtctgacc aataagaaag tcaacagaat tactggcctc 
721 gatccagctg gacctaactt tgagtatgca gaagccccga gtcgtctttc tcctgatgat 
781 gcagattttg tagacgtctt acacacattc accagagggt cccctggtcg aagcattgga 
841 atccagaaac cagttgggca tgttgacatt tacccgaatg gaggtacttt tcagccagga 
901 tgtaacattg gagaagctat ccgcgtgatt gcagagagag gacttggaga tgtggaccag 
961 ctagtgaagt gctcccacga gcgctccatt catctcttca tcgactctct gttgaatgaa 
1021 gaaaatccaa gtaaggccta caggtgcagt tccaaggaag cctttgagaa agggctctgc 
1081 ttgagttgta gaaagaaccg ctgcaacaat ctgggctatg agatcaataa agccagagcc 
1141 aaaagaagca gcaaaatgta cctgaagact cgttctcaga tgccctacaa agtcttccat
1201 taccaagtaa agattcattt ttctgggact gagagtgaaa cccataccaa tcaggccttt
1261 gagatttctc tgtatggcac cgtggccgag agtgagaaca tcccattcac tctgcctgaa
1321 gtttccacaa ataagaccta ctccttccta atttacacag aggtagatat tggagaacta
1381 ctcatgttga agctcaaatg gaagagtgat tcatacttta gctggtcaga ctggtggagc
1441 agtcccggct tcgccattca gaagatcaga gtaaaagcag gagagactca gaaaaaggtg
1501 atcttctgtt ctagggagaa agtgtctcat ttgcagaaag gaaaggcacc tgcggtattt
1561 gtgaaatgcc atgacaagtc tctgaataag aagtcaggct gaaactgggc gaatctacag
1621 aacaaagaac ggcatgtgaa ttctgtgaag aatgaagtgg aggaagtaac ttttacaaaa
1681 catacccagt gtttggggtg tttcaaaagt ggattttcct gaatattaat cccagcccta
1741 cccttgttag ttattttagg agacagtctc aagcactaaa aagtggctaa ttcaatttat
1801 ggggtatagt ggccaaatag cacatcctcc aacgttaaaa gacagtggat catgaaaagt
1861 gctgttttgt cctttgagaa agaaataatt gtttgagcgc agagtaaaat aaggctcctt
1921 ca ^99cgt attgggccat agcctataat tggttagaac ctcctatttt aattggaatt
1981 ctggatcttt cggactgagg ccttctcaaa ctttactcta agtctccaag aatacagaaa
2041 atgcttttcc gcggcacgaa tcagactcat ctacacagca gtatgaatga tgttttagaa
2101 tgattccctc ttgctattgg aatgtggtcc agacgtcaac caggaacatg taacttggag
2161 agggacgaag aaagggtctg ataaacacag aggttttaaa cagtccctac cattggcctg
2221 catcatgaca aagttacaaa ttcaaggaga tataaaatct agatcaatta attcttaata
2281 ggctttatcg tttattgctt aatccctctc tcccccttct tttttgtctc aagattatat
2341 tataataatg ttctctgggt aggtgttgaa aatgagcctg taatcctcag ctgacacata
2401 atttgaatgg tgcagaaaaa aaaaagatac cgtaatttta ttattagatt ctccaaatga
2461 ttttcatcaa tttaaaatca ttcaatatct gacagttact cttcagtttt aggcttacct
2521 tggtcatgct tcagttgtac ttccagtgcg tctcttttgt tcctggcttt gacatgaaaa
2581 gataggtttg agttcaaatt ttgcattgtg tgagcttcta cagattttag acaaggaccg
2641 tttttactaa gtaaaagggt ggagaggttc ctggggtgga ttcctaagca gtgcttgtaa
2701 accatcgcgt gcaatgagcc agatggagta ccatgagggt tgttatttgt tgtttttaac
2761 aactaatcaa gagtgagtga acaactattt ataaactaga tctcctattt ttcagaatgc
2821 tcttctacgt ataaatatga aatgataaag atgtcaaata tctcagaggc tatagctggg
2881 aacccgactg tgaaagtatg tgatatctga acacatacta gaaagctctg catgtgtgtt
2941 gtccttcagc ataattcgga agggaaaaca gtcgatcaag ggatgtattg gaacatgtcg
3001 gagtagaaat tgttcctgat gtgccagaac ttcgaccctt tctctgagag agatgatcgt
3061 gcctataaat agtaggacca atgttgtgat taacatcatc aggcttggaa tgaattctct
3121 ctaaaaataa aatgatgtat gatttgttgt tggcatcccc tttattaatt cattaaattt
3181 ctggatttgg gttgtgaccc agggtgcatt aacttaaaag attcactaaa gcagcacata
3241 gcactgggaa ctctggctcc gaaaaacttt gttatatata tcaaggatgt tctggcttta
3301 cattttattt attagctgta aatacatgtg tggatgtgta aatggagctt gtacatattg
3361 gaaaggtcat tgtggctatc tgcatttata aatgtgtggt gctaactgta tgtgtcttta
3421 tcagtgatgg tctcacagag ccaactcact cttatgaaat gggctttaac aaaacaagaa
3481 agaaacgtac ttaactgtgt gaagaaatgg aatcagcttt taataaaatt gacaacattt
3541 tattaccac
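
The coordinates in the HUMLPL record above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original record: the helper names `span_length` and `check_record` are hypothetical, and the only inputs are the record length, the BASE COUNT line, and the sig_peptide, CDS, and mat_peptide locations copied from the feature table. It assumes GenBank's convention of 1-based, inclusive coordinates.

```python
"""Illustrative consistency checks for the HUMLPL annotations."""

# Values copied from the HUMLPL record; coordinates are 1-based and inclusive.
RECORD_LENGTH = 3549
BASE_COUNT = {"a": 1020, "c": 739, "g": 806, "t": 984}
FEATURES = {
    "sig_peptide": (175, 255),
    "CDS": (175, 1602),
    "mat_peptide": (256, 1599),
}


def span_length(start, end):
    """Return the number of bases in an inclusive 1-based interval."""
    return end - start + 1


def check_record():
    """Summarize whether the annotated spans are mutually consistent."""
    lengths = {name: span_length(*loc) for name, loc in FEATURES.items()}
    return {
        # The four base counts should add up to the LOCUS length.
        "base_count_matches_length": sum(BASE_COUNT.values()) == RECORD_LENGTH,
        # A complete CDS should be a whole number of codons.
        "cds_in_frame": lengths["CDS"] % 3 == 0,
        # Codon count includes the terminating stop codon.
        "cds_codons": lengths["CDS"] // 3,
        "signal_residues": lengths["sig_peptide"] // 3,
        "mature_residues": lengths["mat_peptide"] // 3,
    }


if __name__ == "__main__":
    for name, value in check_record().items():
        print(f"{name}: {value}")
```

Under these assumptions the signal peptide and mature peptide together account for every codon of the CDS except the stop codon, which is what a precursor cleaved into a signal peptide and a mature chain should show.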
LOCUS HUMTHB 26928 bp DNA PRI 14-OCT-1994
DEFINITION Human prothrombin (F2) gene, complete cds, and Alu and KpnI
repeats.
ACCESSION M17262 M33691
NID g558069
KEYWORDS Alu repeat; KpnI repetitive sequence; liver specific; thrombin.
SOURCE Human DNA.
ORGANISM Homo sapiens
Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 6128 to 26928)
AUTHORS Degen,S.J. and Davie,E.W.
TITLE Nucleotide sequence of the gene for human prothrombin
JOURNAL Biochemistry 26 (19), 6165-6177 (1987)
MEDLINE 88077877
REFERENCE 2 (bases 1 to 6667)
AUTHORS Bancroft,J.D., Schaefer,L.A. and Degen,S.J.
TITLE Characterization of the Alu-rich 5'-flanking region of the human
prothrombin-encoding gene: identification of a positive cis-acting
element that regulates liver-specific expression
JOURNAL Gene 95 (2), 253-260 (1990)
MEDLINE 91065538
REFERENCE 3 (bases 1 to 26928)
AUTHORS Degen,S.J.
TITLE Direct Submission
JOURNAL Submitted (22-SEP-1987) S.J.F. Degen, Division of Basic Science
Research, Children's Hospital Research Foundation, Cincinnati, OH
45229-3039, USA
FEATURES Location/Qualifiers
source 1..26928
/organism="Homo sapiens"
/db_xref="taxon:9606"
/tissue_type="placenta"
/clone="L[14,25,33,36,81]"
/clone_lib="Lambda-10"
/map="11p11-q12; 24 bp upstream of NcoI site"
misc_feature 405..511
/note="MER sequence"
repeat_region 563..838
/note="Alu repeat"
protein_bind 725..731
/bound_moiety="Ap1"
repeat_region 842..1136
/note="Alu repeat"
repeat_region 1148..1344
/note="Alu repeat"
repeat_region 1814..2070
/note="Alu repeat"
protein_bind 2052..2059
/bound_moiety="Ap1"
repeat_region 2577..2870
/note="Alu repeat"
repeat_region 3122..3415
/note="Alu repeat"
repeat_region 3804..4087
/note="Alu repeat"
repeat_region 4210..4511
/note="Alu repeat"
repeat_region 4553..4793
/note="Alu repeat"
repeat_region 4901..5201
/note="Alu repeat"
protein_bind 4957..4962
/bound_moiety="Sp1"
protein_bind 5084..5091
/bound_moiety="Ap1"
repeat_region 5231..5443
/note="Alu repeat"
protein_bind 5231..5238
/bound_moiety="EBP 20"
protein_bind 5711..5716
/bound_moiety="Sp1"
protein_bind 5723..5730
/bound_moiety="EBP 20"
protein_bind 6047..6054
/bound_moiety="EBP 20"
misc_feature 6198..6237
/note="MER sequence"
exon 6544..6653
/note="prothrombin precursor"
/number=1
sig_peptide join(6575..6653,7040..7089)
/gene="F2"
gene join(6575..6653,7040..7200,7860..7884,8127..8177,
10504..10609,10706..10842,13181..13495,13820..13948,
14033..14159,15317..15484,15982..16155,16698..16879,
26327..26397,26544..26687)
/gene="F2"

CDS join(6575..6653,7040..7200,7860..7884,8127..8177,
10504..10609,10706..10842,13181..13495,13820..13948,
14033..14159,15317..15484,15982..16155,16698..16879,
26327..26397,26544..26687)
/gene="F2"
/note="precursor"
/codon_start=1
/product="prothrombin"
/db_xref="PID:g339641"

/ translation^ "MAHVRGLQLPGCLALAAIX^SLVHSQHVFIAPQQARSLLQRVR^ 

NTFLEEVRKGNLERECVEETCSYEEAFEALESSTATDVFWAKYTACETARTPra 

CLE(^CAEGLGTOTRGHVNITRSGIECQLWRSRYPHKPEINSTTHPGADLQENFCRNP 

DSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPRSEGSSVNLSPPLEQCVPDRG 

QQYQGRIAVTTHGLPCUVWASAQAKALSKHQDFNSAVQLVENFCRNPDGDEEGVWCYV 

AGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFGSGEA 

DCGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCG 

ASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYI^ 

PRYNWRENLDRDIALMKLKKPVAFSDYIHPVCLPDRETAASI^QAGYKGRVT^ 

ETWTAWGKGQPSVLQVVNLPIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEG 

£S^pf^spfnnrwqmgivswgegot 



intron 6654..7039
/note="prothrombin intron A"
exon 7040..7200
/gene="F2"
/number=2
mat_peptide join(7090..7200,7860..7884,8127..8177,10504..10609,
10706..10842,13181..13495,13820..13948,14033..14159,
15317..15484,15982..16155,16698..16879,26327..26397,
26544..26684)
/gene="F2"
/product="thrombin"
intron 7201..7859
/note="prothrombin intron B"
exon 7860..7884
/gene="F2"
/number=3
intron 7885..8126
/note="prothrombin intron C"
exon 8127..8177
/gene="F2"
/number=4
intron 8178..10503
/note="prothrombin intron D"
repeat_region 8330..8675
/note="Alu repeat copy A"
repeat_region 9030..9161
/note="Alu repeat copy B"
repeat_region 9176..9475
/note="Alu repeat copy C"
repeat_region 9643..9937
/note="Alu repeat copy D"
exon 10504..10609
/gene="F2"
/number=5
intron 10610..10705
/note="prothrombin intron E"
exon 10706..10842
/gene="F2"
/number=6
variation 10774
/gene="F2"
/note="c in DNA; a in cDNA"
intron 10843..13180
/note="prothrombin intron F"
repeat_region 10933..11232
/note="Alu repeat copy E"
repeat_region 12089..12390
/note="Alu repeat copy F"
repeat_region 12391..12689
/note="Alu repeat copy G"
exon 13181..13495
/gene="F2"
/number=7
intron 13496..13819
/note="prothrombin intron G"
exon 13820..13948
/gene="F2"
/number=8
intron 13949..14032
/note="prothrombin intron H"
exon 14033..14159
/gene="F2"
/number=9
intron 14160..15316
/note="prothrombin intron I"
repeat_region 14325..14643
/note="Alu repeat copy H"
repeat_region 14820..15126
/note="Alu repeat copy I"
exon 15317..15484
/gene="F2"
/number=10
intron 15485..15981
/note="prothrombin intron J"
exon 15982..16155
/gene="F2"
/number=11
16156..16697
/note="prothrombin intron K"
16306..16596
/note="Alu repeat copy J"
16698..16879
/gene="F2"
/number=12
16880..26326
/note="prothrombin intron L (no splice consensus at
16880); putative"
16952..17098
/note="potential new repetitive element copy A; putative"
17145..17206
/note="potential new repetitive element copy B; putative"
17375..17614
/note="Alu repeat copy K"
18250..18531
/note="Alu repeat copy L"
18545..18795
/note="Alu repeat copy M"
19231..19527
/note="Alu repeat copy N"
19706..20012
/note="Alu repeat copy O"
20584..20815
/note="Alu repeat copy P"
21088..21375
/note="Alu repeat copy Q"
21120..21290
/note="KpnI repeat copy A"
21387..21539
/note="Alu repeat copy R"
21814..22110
/note="Alu repeat copy S"
22315..22434
/note="Alu repeat copy T"
22441..22738
/note="Alu repeat copy U"
22748..22921
/note="Alu repeat copy V"
22922..23203
/note="Alu repeat copy W"
23204..23496
/note="Alu repeat copy X"
23558..23876
/note="Alu repeat copy Y"
24037..24363
/note="KpnI repeat copy B"
24421..24720
/note="Alu repeat copy Z"
24721..25015
/note="Alu repeat copy AA"
25112..25282
/note="Alu repeat copy AB"
25283..25575
/note="Alu repeat copy AC"
25752..25998
/note="Alu repeat copy AD"
26327..26397
/gene="F2"
/number=13
26398..26543
/note="prothrombin intron M"



FIG. 28D 






26544..>26687
/gene="F2"
/note="prothrombin precursor"
/number=14
polyA_signal 26765..26770
repeat_region 26881..26928
/note="Alu repeat copy AE"
BASE COUNT 6463 a 6624 c 6755 g 7086 t
ORIGIN

1 gcgtgagcca ctgcgccctg accacatata atttttatta attataatgt tgaaagtccc
61 tttattccac acctctcctc tcattcactc ctggtaggtc atttttaatg atttgatgta
121 tatactgaat ttggatgctt cttgctacag ggcaaagacg ctaataagat tttgctggag
181 ccttttcaca gatgcaagtc aatccaggca gtgtctatag ctgctgaacc caaaatcaga
241 aagcgagggc tatcaaagct cttctgtcct gatttgcaac tttagtagtg caagaaaaaa
301 aatcttagaa taaaaaatgg gtaccgttca gagaccttta gagattgcaa ggcatcacag
361 atgataaaaa gctccatctc tagacgtgtt caggagtggg ttggggcttt gaccttgact
421 agctgcatca acttggacaa gtcacttcgc ttccctgtgc ctcagtttcc tcatccataa
481 aatggggata agtatagtac ctacctcata agtcctgcct acctagcaca tggtgagcaa
541 ttactaaatt gtaggcctag tccctataat cccagcactt ttggagaaca aggtagggga
601 atcgcttgaa gccaggagtt ccagaccagc ctggccaaca tagtgagact gtgtttctat
661 aaaataaaaa aaaaaaatac ccaagcttgg tggtgcaggc ctgtagtccc ggctacttgg
721 gagtctgagt caggaggatt gcttgagccc aggagttcaa ggttgtagta agctatgatt
781 gcaccactgc actccagcct ggcgacagag catgaccctg tctctaaaaa tataaaatta
841 ggccaggcac agtggttcat gcctgtaatt ccaacatttt gggaggccaa ggcaggtgga
901 tcactgtgag ctcagcagtt cgagaccagc ctgggcaaca aggcaaaatc ctgtctctac
961 taaaattaca aaaattagcc aggagaggtg gtacacgcct gtaatcccag ttactgggga
1021 agctgaagca ggagaattgc ttgaacccgg gaggcgaagg ttgcagtgag ccaagatcgt
1081 gccattgcac tgcagcctag gagacagagc gagactcgat ctcaataaat aaataaatta
1141 attaattaat aaaaaaataa gttgggcatg gtggcacctg cctgtagtcc aagctactca
1201 ggaggctaga ggtgggagga tcacttgagc caggagttct aggctgcagt gagctattat
1261 cacgccacca tactccagcc tgctgtatgt actccagcct gggcaacaga gtgacaccct
1321 ?5 caaa9t aaa 9taaaat aaaaattaaa aaacaaatta ctaaattgta cttaacagta
1381 ttgt ^ a 5 ca ^ tcttcctaaa taggaggaca ggcaaaatta agggacttaa catgtgccct
1441 caggtatagt agtttggggc aggccagcat cacccgcaca gtagttctgt actgtaggtg
1501 cgtgttctct gggtcaactt tatggcccag tgaggccgta ctctaccaga atgtcagggg
1561 acaagggttg ggagaggcaa aagtgctggt ctgaagcagg agtctgggtt tccatcctag
1621 ctctaccacc aattctgtat gaccgtgccc cctccatttc ctccatgacc acatagagac
1681 atggggcagt tggatgaaat caatgattcc cagtcttggc tctatcatgg aaccatttgc
1741 taacttcttt ttttctctta tggatcccat atttttaaag atttttacta aatagaaatt
1801 gacttatact tttccaagct ggagtgtggt ggcatgattt cagctcactg caacctccgc
1861 ctcccgggtt caagtgattc tcctgcctca gcctcctgag tagctgggat tataggtgct
1921 caccaggccc ggctaatttt tttgtatttt tagtagagac agaatttcac catgttggcc
1981 aggctgattt caaactcctg acctcaagtg atctgctcac ctcagcctcc caaagtgctg
2041 ggattacagg cgtgagtcac tatgcccagc cgcttactca cattttctag tcaaaataga
2101 aaactgctta agtcactgtc tgcagaagag caaaaaaaaa aaaagaaata aaaaattgaa
2161 aactgctgat cagattgaga aaaacataag attattcacc acctaaagag aaaaaatttc
2221 a ?5 Cgaaagg S aa *aaaatt catttttgtc ttaataaggc aaattcacaa tttttgaggt
2281 tttaacaaaa tatatgcaga aagacaaggc caccccgtag aacgtgcaca cagccctagg
2341 cttggaaatg gctggattta ataatatctg gtctttcttt gagccctgaa attctctaac
2401 actatgtctt ggaacataat tttactgttt tcagtggtta tagagatttg ctttacaatt
2461 tagcattggt ctttacccat gattttgttt gacgccaact tgttggcagg aatgcacccc
2521 ctgccccccg ctttgttatg gccttgctcc tatagggcaa gaatatctgc tttaaggccg
2581 ggtgtggtgg ctcaggcctg taatcccagc actttgaggg gccaaggcgg gcagatcacc
2641 5 5 gaggtcagg ^tttgagac cagcctggcc agtatggtga aatcctgtct ctactaaaaa
2701 taacaaaaat tagctgggtg tggtggcaca cacctgtaat cccagctatt tgggaggccg
2761 aaacaa 5 a ga accacttgaa cccaggaggc ggaggttgcg gtgagccgag attatgccac
2821 5 g CtCcag cct 99gaaac agagcaagat tccgtctcac acacaaaaaa tatatatatg
2881 tctgctttaa gtatgcaggc cgtgtttgtg ctgaacggca ggaatgccaa acttggctgc
2941 at 9S taccaa ctagggacct cagagttcca aggagaacaa acagttggtt cctggaggct
3001 gggggcttgt atcagaccct gaagactaag catgtgctgg gtccattgtt gtcctgcacc
3061 catggtagtg cactaaacac ctaacctata tttaagtgtt tttgtttgtc caaaaaatgt
3121 cttttttttt tgggagtcaa gagtcttgct ctgttgccca ggctggagtg cagtgacacg
3181 atctcagctc actgcagcct ccgcctcccg ggttcaagct attctcctgt ctcagcctcc
3241 caaatagctg agactatagg cacgcacatc catgcccagc taattttttt atttttagta
3301 gagacgaggt gtctccatgg tggccaggtt ggtcttgaac tcctgtcctc aagtgatcca
AH cctcccaaag tggtgggatt gcaggcatga gacaccgcgc ccggcctgcc 

348^ a ttaaaatgag ttgtccattt gtaagctgct gatttctttg ggacattgtc 

All "tcataaag catcagtgat ttcaccattc ttccacccaa gcttcaccgt 

3603 ^^^ g " g " t9ttcttg cttcaa "tc agcagaattc atttagctct gataagggct 
lt°f} =f" tcaaa = tgatgtctta tccttcttag tgcctcaaac tacatcctgt tcactcatgt 
\V£ ^ a ?" agt ta 9tgtgagt ttattttggt gcacaaaaat ttttttaaat ccatgcagtc 
Vill "ttttcata atacgcattt tccatgaact tttcgaagac cccttgtaga tgtctgttgt 
llll ^ aaaCCaCC cagtttaca 9 taattttttt ttttttttga gatgalgtct tgctc?gtcg 
All a ? tscatt 99 cacactctcg gctcactgca accEctgcct cctgggEtc! 

3961 ctgtctcagt cccccgagta gctgggatta caggtgtgtg ccaccatgcc 

4021 a ac cc a r^"^ agtagagacg Sggtttcact atgttggcta ggctggtctc 
llll ™ t Jt cctt 9 t 9 atc gscccgcctc ggcctcccaa agtattggga ttacaggcgt 
llll llll^tm r^" CCt acagtaattt tatagcagcc taggctlaga tagccl?t?c 
4141 tgggtataag aatgtcatat actgaacagg cctgcaactg tgagtaaaag tctgcaaaga 
4201 ggccgggcag tggctcatac ctgtaatccc agcactttgg ggggccgagg caggtggatc 
1!S aaatac 99 ^ agCag " cga gaccagcctg accaacatgg ?gILccc« tcfctlctaa 
AH aaa =^ a ttagctgggc gtggtagtgc atgcttgtaa tccctagcat gcacttggga 
till III I 9 " 3 a 99 c t3aggc aggagaatca cttgtactca ggaggccgag gttgcagtga 
Ail ZlT.llltl ^gccactgca ctcctttctg ggtgacagag tgagactcca ?ctcaaaaaa 
4501 acaaaacaaa acaaaacaaa aacaaacaaa aaaacccaac aggtaggtag cagtggttca 
4621 a?™ aat CCCcacttt 9 aaggctaaag tgggcagatc acctgfggtc aggagftcac 
till ? MT^rZ 9 sgcaacat 99 tgaaactctg tctctacaaa aatacaaaaa ttagccaggc 
4741 cc?™™? ? tgct ?" gt tccagctatt cgggaggctg aggcaggaga atcgcttgla 
4801 a aga 9 gtt 9 ca gtgagccgag ttcacgctat tgcactccag cctScatStc 

4861 tttattttat aaatatatat tataatttta ttttatttat tcaattttat 

tttatttttc taggaacagg tctcattcag gccaggcatg gtgctcacgc 
llll tt™^? ag T Cttggg ^ccgaggt ggaggtgggc ggatcLctg Iggtcagglg 
soli S cct 99 tcaa t gtggcgaaac cccatctcta ctaaaaatac alaaat?ag?- 
5101 cS™ ^ a ^ C9CC ^ taattcca gctacttggg atactgagtc aggagaatla 
llll S^?™" gagat ? gaaa ttgcagtgag ccgagattgt tccactgcac tccagcctgg 
llll 9990 gagactcc 9 t ctcaaaaaaa aaaaaaaaaa agaaagaaag aaagaaagaa 

till a?™ 3 ? 93 tcttactctg "acccaggc tggagtacag tggtgcaatl atagctcac* 
•III 9 ca gg ca tgc accaccattc ccagctaatt tttaattttt tttggtagag atgagggtct 
5341 tgctatgttg cccaggctgg tctcaaactc ctggcctcaa gcgi?cc?g 9 ca?g?Iggcc 
till ^ aaa9 ^ 9 ttgggattac aa gtgtgagc cactatgcct igcctaaala tatltllltg 
5521 at a tt a r a S a agaaatgggc ^tcccaggaa ttaaggtgtt tgcgggagtc ctggtcccca 
llll caacactccc Cgttcccaca catgacctgg tccagacccc aaacagccag 

5641 T g9 l gaggC ga 99 c 9 a 9 aa cttgtgcctc cccgtgttcc tgctctttgt 

5641 ccctctgtcc tacttagact aatatttgcc ttgggtactg caaacaggaa atgggggagg 
=701 gacaggagta gggcggaggg tagggtagga ccagaagccE ctctaggcct gccl?gggg? 
5761 aggcagccag ggagaaggag ggcccctcag tggagaccca gggat£?cag Lglclc^gt 
llll tao 999 ^ 9 ? C9 ^ a9gtcc tgggaggtga cagaagatag altaaaggcc calgagtclc 
5881 tggacctgac tcctcccagc agctgccaca cacaaacaca cctccaggca ccctggacag 
lltl ? aa ?! a ^ 9a ? aaatgggccc ctcctccagt ggctgagaag ctggggcaaa tgttggctg? 
toll ggtgcatccc atggcgaggg gcaacttcca tcaggccaca cSttttatec 

6061 ttgtctctat ttttgatatc tgtgtattat gattatacaa acccccacat tggcctatat 
6181 9 attaa 9 aa c ttacgatatt ccatggacat tccattccta atctccttta 

624^ L™^ caaa 9 ta tta ttcccattgt atagatgagg aaactgaggc acacagagat 
6241 gacaagcaac caccgctata tgttaggatt cgaaggagct ccaggaaagt ctcatagccc 

till c?tt??tata atctcagagg mntm gaging tgacag?glc 

cit, ctttttt g t g actcctccta gaccatccat ccctgctccc aggaggacct otcctcccaa 
All a o? 9t " a ? a tggacaggag gactatctac ccaclcgtcc cclcggccct gaccctctgl 
All ofof a ^ Ct ^ tccgctgatt tcttcatgtt agttcaacat taccclgagg ggtcaggaca 
till S cagtgaccca ggagctgaca cactatggcg cacgtclglg glttgwgct 
6661 9 ata??rn?f f ggccct " "gccctgtg tagccttgtg cacagccagc atggHaaggg 
6661 agtgcttgca ggctggaaca ggctggagga ctggggtgtg ggcccatggg ctggggtctc 
6721 ctggctggac agagcacaca gagctggccc ctaagtaggE Ilcagcclcl gglgglcagc 
684? taaa« 9aa f a a ^ cag9a9c "agggctgg aaagagaatg gccgcttctc SttSSS 
6901 Ultrrttnn m 9959 ^* ^ec*^ taggaggggc acagggggcc acatttagca 
6961 " ttccacca gcccagactg cctctctcag aagccagcag gggagggtgg 

6961 gcttgcttca tgcccccaga tggccaagac tgcctgttcc tgaggtcgct gttccatg 99 
llll « e C S "" taCa9t gtt ^tggct cctcagcaag cScSSSct gctccaglgg 
7081 gcccggcgag ccaacacctt cttggaggag gtgcgcaagg gcaacctaga gcgagagtgc
7141 gtggaggaga cgtgcagcta cgaggaggcc ttcgaggctc tggagtcctc cacggctacg
7201 gtgagcctgg gctgctcgga cggtgccggg gcctcagacc gggcccaact ctagacactt
7261 ccacagagaa gcaagcgagg aacgccacag ccccttcgct gctcacagcc tcatttcaac
7321 tctgagcccc tcctcacagg gctggcaaga ggagcggcct cagcctttcc tgggggtctc
7381 tgtgcctgga ctgtgtccct gtgcagctcc atgacatggg gaggcctcca cagtcttcag
7441 acatccacct gccttggagc tctgtgtcca catggcctcc tcagcggcag actcccacac
7501 cacccttgag gggtgggact ctggggaggc caccacaagc ccccgggctc aagactcagt
7561 gttcctggag ctctgtgtcg cctttcctgt ctgtagggac tctgccaggg acccactgcc
7621 ccctctcctc ccatctcccc cagcctcttt cagactcggt gtgtgtgttg gaggaactcc
7681 cctatcctca aatattcttc tccttttgga aacaaaagta ggaaactctg ccacaaacct
7741 ccccagagcc tgccccctgc gtgaccaggg taaggaaagt gtgaggagga gcataacatt
7801 tactaaaaca acacaaaaca ggagctgccg tagcctcact cccagccctt gtttttcagg
7861 atgtgttctg ggccaagtac acaggtgagc accgggaagg atttgcccca ggaagggagg
7921 cctggggacc ccagtgagag aattctaccc agagaatctt ctgctgcacc tagccatcca
7981 cccatccacc ccttccccac tccttccttg gtccctccca tctgttcatc catctttctg
8041 tttctcacca acatcccatc caccctgact ccagctcatc ctggccatac cccaatccca
8101 aaggtaaaca cctgggtctt ttccagcttg tgagacagcg aggacgcctc gagataagct
8161 tgctgcatgt ctggaaggtg agcaactgac acgggtttgg ggagcaggac atggagggga
8221 gcttgggaga agagctcagg ggtgggtttg gagtgtggct ggtggaggcc gaggcagtcc
8281 ccagcatctg acattgctcc cattcctggg gtcaagatgt ctctttgtac ctggctctgt
8341 gtctggcatg cgaacgaatg aatgaatgaa tggactaatg aattaatgtt tttttttttg
8401 agacagagtc tcgctctgtt gcccaggctg gagtgcagtg gcacgatctt ggctcactgt
8461 aaactccgcc tcccggattc aagcaattct ctgcctcaac ctcccaagta gctgggatta
8521 caggtgctcg ccaccacgcc tagctaattt ttgtattttt agtagagacg gggtttcacc
8581 atgttggcca ggctggtctt gaactcctga cctcgtgatc cacccacctc ggcctcaaag
8641 tgctgggatt atagaagtga gccaccgcgc ctggccatga attcatgttt aaggcttcat
8701 tctcctttgc ctgacccgag tctctgcccc cacctagtca gagctttgat gatgtcacat
8761 tccccttcta gctttaggtg tcactgaacc aaacaggaac ccaaaccccc agctgctctg
8821 acaccaagga cttccctaag catgccaagg tgtttctagc acctggcctt gcatatgttg
8881 tcaatttcct ctggagcgac catcacatct actgaacact ttcctatcct tcaaggactg
8941 cttcaaatgt caccactttt gctgagactt cagggagcac cctccctcct gcactgtgtc
9001 tgaaggcacc tttagcacga caaaaatgga actctttgtt tatttataag agcagggtct
9061 cccttttttg ccaggctgat cttgaactct tgggctcagg caattctccc atctcagtct
9121 cccaaaggag tggattataa gtgtgagcca ccatgcctgg ctgccatact ttcatttttt
9181 tttttttttt tttgaggtgg agtctcactc tgtcgcctag gctggagtgc agtggcgcga
9241 tctcggctcg ctgcaacctc cgcctggcgg ttcaagtgat tctcctgcct tagcctcctg
9301 agtagctggg attacaggca cacactacca tgcccagcta attttttgta ttttttagta
9361 gagacggggt ttcaccatgt tggccaggct ggtcacaaac tcctgacctc aggtgatcca
9421 ccagcctcag cctcccagag tgctgggatt acaggtgtaa tccactgcgc ccagcctcat
9481 ttgttaaatt acgtactcaa cagacatttt acaaagttcc tgctacgtgc caggcactat
9541 atcaggtgct ggggatttta agagaatcaa atacagtctc tgccttcaag gaattcaaaa
9601 tctcaaaaga gaacaaaaat acaaaatatt aaaatgattg cggccgggtg tggtggctca
9661 agcctgtaat cctagcactt tgggagctga ggtgggcgcc caggccaggg gtttgagacc
9721 atcttggcca acatagtgaa acccccaacc tctactaaaa atacaaaaat tagctggggt
9781 gtggtggcac gcgcctgtaa tcctagctac tagggaggct gatggggaga atttcttgaa
9841 tctgggaagc gaaggttgca gtgagctgag atcatgccac tgcaccttca gtctcggcat
9901 cagatcaaga ctcatctcaa atacataata aataataatt caataagtga ttgcaagaaa
9961 gttctgttca aggcaccaag agaccacagg aaaatgagtg tctggtttgc cagaaaatga
10021 gagatggctt cccaggagag gcagagttct gcctggccta gtgggatgca tggatgaaca
10081 aacaagtggg cattccagtc agaagaaaca atccgtggaa agacccagag gcatgagaag
10141 ctgagctagc agggacaggt agaccagggc cagttgaaaa ggaccttcat cactttttca
10201 tcctgctggc caagagaagc cacagaatgg aagctccatg agggcagggc tgtgactgtc
10261 ctattggttg atgtgtactg agcacccgac agtgcctgtc atatggtagg cacttagcga
10321 atatttggag gccactgttg agtgaatggg agaactgctg gttgcagagg aagaggggct
10381 gggtgaatgc aggttcagga ttgtggacct gcatgagctg ggaggtgggg gatagacaac
10441 tttgcaggga gagaggaaat aagtccccag gctccaaggc tgaccggggt ggggtctccg
10501 caggtaactg tgctgagggt ctgggtacga actaccgagg gcatgtgaac atcacccggt
10561 caggcattga gtgccagcta tggaggagtc gctacccaca taagcctgag tgagtgaggg
10621 gtcggccttc ccaccatggg ctgagaacag ggagcaagcg tacctcaagt tcaacagcct
10681 cctgttgggc aatttcctct tccagaatca actccactac ccatcctggg gccgacctac
10741 aggagaattt ctgccgcaac cccgacagca gcaccacggg accctggtgc tacactacag
10801 accccaccgt gaggaggcag gaatgcagca tccctgtctg tggtaggctg ggggcagtgg
10861 ggcgacccat gaccaagccc gggggcttca tggggcctgg cagcctggga tgggaaccaa
10921 gaatactggc tacccaggca cagtggctca tgcccgtaat cccagcactt tgggaggctg



FIG. 28G 






10981 aggcaggcag atcacctgag gtcaggggtt tgagaccagc tgggccaaca tggcaaaacc
11041 ccgtctctac taaaaataca aaaattgcca ggcgtggtgg tgggcgcctg taatcccaac
11101 tactctggag gctgaggcac gagaatcgct tgaacccggg aggcggagtt tgcagtgagc
11161 tgagatcctg ccactgtact tcagcctagg cgacaagagc aaaactctgt ctcaaagaaa
11221 aaaaaaagat gctggccacc ttcagagctg gcgtcagtca ttcagatcat atctgtgcct
11281 attgctcagt aaagtcaggg aatcagggga tctgagtggg gggatctgcc agcctcctcc
11341 tccccctccc cactcttgac ttccttatgg tctaggctgt ggctcattcc aaacatgcct
11401 cctttctgat caaggcactc ctccctccgg gaagccctcc ctagccattt cagtccacac
11461 accctgttct gagtatcaca gagcaagcct tgtgcagttt ggcccgcggg attctgtcat
11521 tattatttcc ttggtgtgtt aagtagctat agccacccct tccctgaggc agaccacaat
11581 aagcatttct ttttcccatg agggttggca ggtgtggctg cactcgctaa tgcgtctgta
11641 gggtcaactg acggaggttg gccctggctg ggtggctctg attcaaataa tgggtccagc
11701 tgagtctggc tcctcgttga gggttgggcc tagatctgct ccacgtgcgt tcatgctggg
11761 gctgaggctg aaagaaggta cctgggaaaa ctcttcttat gctgatgaca gacacagaaa
11821 acaatgaaca gaaaagcgtc ttctgtcctg aaggcctggc tcagaacagg cacagtcagc
11881 cctgcccacg ttccattggc cagagcaagt atatgttcaa ggccagggtc aagaggtaaa
11941 ctacacctca gcctgtaaaa tcacagagca agggatgtgg atgcaggcag gggtaaagaa
12001 tttgtgccga ttaccagtcc acaaacatgc gttagtgttt gttctctagg caaccctgtc
12061 gggcccattg ctcattcctg gggttggtct tttttttttt tctttctaag aaggagtctc
12121 actcccttgc ccaggctgtt ggagtgcagt ggccctatct cagctcactg caacctccgc
12181 ctcctgggtt caagcgattc ccctgcttca gcctcctgag tagctaggat tacaggcgtg
12241 tgccaccact cctggctaat ttttttttat gttagtagag acggggtttc accatgttgg
12301 ccaggctgat ctcaaactcc tgaccttgtg atcctcccgc ctcggccccc caaactgctg
12361 agattacagg ggtgaggcac tgcgcccagc catttttttt tttttttttt tttgagatgg
12421 agtctcactc tcacccaggc tggagtgcag tggcataatc ttggctcact gcaacctcca
12481 cctcctgggt tcaggcgatt ctctgcctca gcctctcata tagctgggat tacaggcaca
12541 cgccaccacg ccttgctaat tttgtatttt tagtagagac ggggtttctt catgttggcc
12601 ttgcctgact tgaactcctt gttccggtga tctgcccagc tcggcctccc aaagttctgg
12661 gattacaggt gtaagccact gcgcctggcc cctggtattg gtcttatagc aagtttatcc
12721 caacaaaaac agctactatt tactccccaa cccccataca cacgcacaca cattgatgat
12781 aaataagttg caggcttgca gaaattggcc catccaggtg aacagcctag tgatccgagc
12841 aagcgtcctg ctgtgcagct ataaaaacat gactcctcca gcagctccag gcagccacta
12901 ccagttggtt acagatggcc taggaggcca aacctggtta ctatctctgg tttattatgt
12961 gccagacact tatgctgtat attttgttta atcctctcaa caaacctgca aaagtggcat
13021 tagtaacccc tttaaaggca aacggtcaga agcccagaga ggttaagtaa cctgaggtca
13081 cacaggcaga aagcagcaag accggggttc acacccctgt ctgttccggt ccatgtgtgg
13141 tctcactcac tctgctgcct ccttgcccct cacccaccag gccaggatca agtcactgta
13201 gcgatgactc cacgctccga aggctccagt gtgaatctgt cacctccatt ggagcagtgt
13261 gtccctgatc gggggcagca gtaccagggg cgcctggcgg tgaccacaca tgggctcccc
13321 tgcctggcct gggccagcgc acaggccaag gccctgagca agcaccagga cttcaactca
13381 gctgtgcagc tggtggagaa cttctgccgc aacccagacg gggatgagga gggcgtgtgg
13441 tgctatgtgg ccgggaagcc tggcgacttt gggtactgcg acctcaacta ttgtggtgag
13501 ctgcctgggt agggggcctg agttgcaggg acaaatccta gtgggaataa caacagccgc
13561 ttctgcttat cgaacgctta cctcattgag tgcgctcatt acagccttac agtaaccagg
13621 tggggggtaa ggtcctgtgc ccatttcaca gataagtaca ctgaggcccc aggaggttat
13681 tgcctagtag cccaactgtg catgcacgct taacctctgc accaaatggc ctccaaggcc
13741 cgtaggggaa ctggggggat ctaggggatg ggtgaggaat ggcccagccc agtcccggcc
13801 ggtgcctggg tcccaacaga ggaggccgtg gaggaggaga caggagatgg gctggatgag
13861 gactcagaca gggccatcga agggcgtacc gccaccagtg agtaccagac tttcttcaat
13921 ccgaggacct ttggctcggg agaggcaggt gaggtagtgg gcatccgagg ggatgcgggg
13981 ctgcggggct ggtggccagg acttgcccct cactgcttgg cttgctctgc agactgtggg
14041 ctgcgacctc tgttcgagaa gaagtcgctg gaggacaaaa ccgaaagaga gctcctggaa
14101 tcctacatcg acgggcgcat tgtggagggc tcggatgcag agatcggcat gtcaccttgg
14161 tgtgtcctgg agccctgcgc taccattcac tcctgggggc aggtgtgctg ctggaccccc
14221 accctcaggc cctgcctgca ggcctgggct ttacagatga caacagctga gcatccagga
14281 tcccaccaac tccacacagc agccacatga gatgggttgt ttacttcttt tttttttgtt
14341 tcttagatgg agtcttgctc tgtcacctag gctggagtgc agtgctgcaa tctcggctca
14401 ctacctcgat ctcagctcac tgcaacttct gccttccggg ttcaaacgat tctcttgcct
14461 cagcctcctg agtagctgaa tttacagaca tgcgccacca cacccggcta atttttgtat
14521 tttaagtaga gacagggttt caccatgttg gccaggctgg tcttgaactc ctgacctcaa
14581 gtgatccacc tgcctcagcc tcccaaagtg ccgggattac aggcatgagc caccacaccc
14641 ggcccatggg tcctttactt ctaagcagat ggtaaagctg agactgacgg agctggtggc
14701 tcacctccgc gcacagctaa tgggtttgaa tccagttctt ctgattccag agctgtgcta
14761 cgctatgtga actctggact ggaaggacct agttaggggg tgcaaaaagc aggaggcagg



FIG. 28H 



SUBSTITUTE SHEET (RULE 26) 



WO 99/50454 



PCT/US99/06473 



66/97 



14821 tcaggtgcag tggctcaccc ctgtaatccc agcactttgg gaggccaaga caggaagatc
14881 acttgagggc aggagttcga ggccagcttg ggcaaaatgg taaaaccccg tctctactaa
14941 aaatgcaaaa attagccagg tgtagcagca tgtccctgta gtcccagcta ctaaggaggc
15001 tgaggcggga ggatcgcctg agcccaagag gctgaggctt cagtaagctg tgactgtacc
15061 attgcactcc agcctgggtg acaagagtga gaccctgtct caaaaataaa taaataaata
15121 aataaaaagt gtgaggcagc ccctcagcat cacacggagg ctccagcccc aaaggcggcc
15181 agcccaagct tggatctggg ccccggaggc agctctgccc agctgggttc ttagacctgg
15241 gattgttact tctagggctg gtgtagaggc agccccctca tcctcagctc ctaatgcttc
15301 ctgctgcccc tcccaggcag gtgatgcttt tccggaagag tccccaggag ctgctgtgtg
15361 gggccagcct catcagtgac cgctgggtcc tcaccgccgc ccactgcctc ctgtacccgc
15421 cctgggacaa gaacttcacc gagaatgacc ttctggtgcg cattggcaag cactcccgca
15481 caaggtacag aactggtggc ccgtgggtgt ctggcagggg tctgagtcct ccaaagcgat
15541 catgaggggc cttggtggct ccgggacaca taggatgttc tgtatacccc ccagaatata
15601 acatcccagc agtctctgct ggaaagccat ttggtcacgt cctgactgag gcttggagcg
15661 cggggagaat ccgtctgtct ctggtccctc caacactagg atatagccca tgtgggagtc
15721 tctgaaaata gagtctgtct ggactagggc gtgcagcctg tgcccctgtc cccgtcctcc
15781 aggctgtctg actccaaagc cctgcacggc tttaggccca ggaagaaaca cccagggggc
15841 tgccatggca ggaaccagcc ctatcccctc cctggtggcc tgcaggacac actgtctccc
15901 agaaccccaa gggcaggcag tttcctgctc cttgctgggt gaacctgcag cttctccatt
15961 tctttcttgg ggtctctgca ggtacgagcg aaacattgaa aagatatcca tgttggaaaa
16021 gatctacatc caccccaggt acaactggcg ggagaacctg gaccgggaca ttgccctgat
16081 gaagctgaag aagcctgttg ccttcagtga ctacattcac cctgtgtgtc tgcccgacag
16141 ggagacggca gccaggtggg ccaccagatg cttgttagct gaggggcaga agccaagttc
16201 tgggcctggc tctgatacca agtagccttg caagagcccc tttccctttt ccaggcctcg
16261 gtttcttgga gtgaacccaa aagttctttt cagtactggc gttttatttt ttatttatat
16321 ttatttattt actgacggag ttccactctt gtctcccagg ctggagtgta gttgtgcgat
16381 cttggctcac tgcaacccca cctcctgggt tcaagcgact ctcctgcctc agtctcctga
16441 gtagctggga ttacaggcta atttttgtat ttttagtaga gactggtggg tttcaccgtg
16501 tcggccaggt tggtctcgaa cccctgacct caagtgattc acccgcctcg gcctcccaaa
16561 gtgccgagac cacaggcgtg aacgtctgtg cccagccagc tctggcgttt tagattctgg
16621 tctctaagaa atggcgttgg ggccaggcgg ctcctgtggg ggttggctct cactaggccc
16681 ttcttccttc cccaaagctt gctccaggct ggatacaagg ggcgggtgac aggctggggc
16741 aacctgaagg agacgtggac agccaacgtt ggtaaggggc agcccagtgt cctgcaggtg
16801 gtgaacctgc ccattgtgga gcggccggtc tgcaaggact ccacccggat ccgcatcact
16861 gacaacatgt tctgtgctgg caagtctgtg cagggcgggc tgagggaaca gtggggccca
16921 agctgggaga actgagttgt gcctgggttc aagccatgtg actttgagca agttgcctaa
16981 cctcttggtg gctcagtttc ttcctctgta aaatggaggt aaaagtctct atcccataag
17041 gttatgggag ggttaaatga agtagtatat attaatgtac ttggcatagt atcagtcacc
17101 agtgagctca gatagcagca agaggctgcg ggtagggaaa tgccattcat tcagtcactc
17161 agcaaatatt tattgagcgc ctatcacgtt ccaggcagcg ttctagggta tacagcaggg
17221 acccagacgg acaatgtctg tgccctcaga gagcttcctt cctaggaggg cacatccata
17281 aacagatcta aaacagcaat ccctgaccag tgctgtgaag aaaaatgaag cacagggaga
17341 gagaacggct gatgaagtgg gcttctaaat agggtggcca gacaaggtgg gcagatcact
17401 tgaggtcagg agttcaagac cagcctggcc aacatggtga aaccccgtct ctactaaaaa
17461 tacaaaaatt agctggtcat ggtgacgcat gcctgtagtc gcagctactc aggaggctga
17521 ggcaggagaa ttgcttgagc cagggaggcg gaggttgcag tgagctgaga tcgggcatca
17581 ttgcactcca gctgggcaac acagcaagac tccattgatc gatcgatcaa tcaatcaatc
17641 aggtggccag agaaggttgg agaaggcctc cctgagaagg tgatgtctgg gcagggactg
17701 gaagagggga aggaaggagt gagcaggcat atctagggga ggagcaccgc aggctggggg
17761 catggcaggc actaaggccc tgaggtggga gcactcttgg cttgtctggg gagcagtagg
17821 gaggcctggg gggctgagga ggggcagcag tgggtgaggg gagagagggg ggcaggcaga
17881 ggacagccac ttcctttagg gcctggaagg actttattga gtgagatggg aagttattga
17941 ggggcttgag gcaggttaag aaatgatgtg actgacttta aaagtaaaaa ataaaaaaat
18001 ttagtgtaat ttcagactca cagaaaagtt gtaaaaataa tacaaagatt tcctgtatac
18061 tgtcatccag attgtcctcc attctgtgga tgtgtgggaa tttttatata tatatatgca
18121 tagtttgaga gcaaatcatg aatatggttt ctttttaccc ataaatactt gagtatttcc
18181 aaaaaaaaaa aaaataccca aggatgttct cttatgcaac cacaatacaa atattaaaac
18241 ccggaaattt ttttttgaca tagcttcgcg tcacccaggc ttgagtgcag tggcacaatc
18301 tcggctcact gcaacctcct gctcccaggt tcaagtgatt ctcctgcctc agcctcctga
18361 gtagctggaa tcacaggcat gtactaccat gcctagctaa tttttgtatt tgtagtagag
18421 acagggtttc accatgttgg ccaggtcggt cttgaactcc gacctcaggt gattcacctg
18481 cctcggcctc ccaaagtgct gggattacag gcgtgaacca ctgtactcgg ccaaaaccag
18541 gaaatttttt tttttttttg agatggaatc ttgctctgtt gcccaggcta gagtgcagtg
18601 gcatggtctc ggcttacttg gaattacagg tgcctgccac cacgcccggc taactttttg



FIG. 28I




18661 tatttttagt atttttagta gtgatggggt ttcaccatgt tggccaggct ggtcttgaac 

18721 tcctgacctc gggtaatcca cccacctcgg cttcccaaag tgctgggatt acaggcgtga 

18781 gcaccagcac ctggcccaaa accaggaaat taatgatgat acaatattat tgtctaatct

18841 atagacctta ttcaaatttt tgttagtctt gctaatgtct tttataggga aaaaaaaaaa 

18901 aaaaagcgtg tttctcaccc aggattcaat gaaggatctt tctttgtctt ctatgacctt 

18961 gacatgtctg atgagtgcag tctggttatt ttgtacactg gccctgaatc cgggtttgtc 

19021 taaggtttcc tcacggtcag gttcgggctc agtggtgcca tgtccttctt ggtgcatcct 

19081 gttaactggc acatgagaac aatttgtctc atatgtggtg agtctaactc tgacctcttg 

19141 aggaaggcaa tgtctgccaa gtttcttgct gtaacttctg tttttccctt tgtaattaat 

19201 aagaatctgg taaagagaca ctttgatgtt tttttttttt tttttttttg tgatggagtc 

19261 tccctctatc acccgggctg gagtgtgtgg tgcgatctcg gctcactgca acctccatcc 

19321 cccaggttca agtgattctc ctgcctcagc ctcccaagta gcagggatta caggcatgtg 

19381 ccaccacacc cagctaattt ttgtattttt agtagagatg gggtttcacc atgttggcca 

19441 ggatggtctc gaactcctga ccttgtgatc cgtctgcctc agcctcccaa agtgctggga 

19501 ttacaggtgt gagccaatac gcctggccta ctttgatatt ttgtattctg tttgcatcaa 

19561 aaccttctcc caactagggt gactaccaaa tggcacttat ctaattctgt cattccttct 

19621 acatttgtta gttactttat tgctttcctt cctttcattc tatcagtgtg gacttaagga 

19681 tccttacttt attctaaggg ttcacctttt ttttcttttt ttttgagatg gagtttcgcc 

19741 catgttgccc aggctggatg gagtgcaatg gcgtgatctc ggctcactgc aacgtcctcc 

19801 tcccaggttc aagcaattct cctgcctcag cctcctgagt agctgggatt acaggcatgt 

19861 gccaccacgc ctggctaatt ttttgtattt ttagtagaga cagggtttca ccatgttggc 

19921 caggctggtc tcgaactcct gacctcaggt gatccgcccg cctcagcctt ccaaggttct 

19981 gggattataa gcgtgagctc taccgtgcca ggccatactt tgttactact gttatttttt 

20041 ctgatgctca gatgatccca agtttggcct gtggaagtcc cttcaagctg gcttctgtga 

20101 cttggggaga tgttctgtca ttctttgagt actttctttc tttctggcac agcaaaatga 

20161 ttcaggttaa tcctactttc cttactgtag tgttggaacc agccatttct ccagggaacc 

20221 cttgtagtca agagtggaat ttagaactga gatctgggtg ctggcgtgtg cacattgcta 

20281 gtgggatgtc attacttcta ggctctctta gtggacagaa ccagaaaaaa attatatgat 

20341 gcatatacca atatctctat catctatata aaaaaccatg agttcctact gaaacctcca 

20401 attccattct aacaccacag gattaatttt agcttttcct tttccatatt tgtaactctc 

20461 tctgttgaca gtgagaaacc tgaccctcat tatctgtaat gcatttgcct atttgaacaa 

20521 tactagaata tagtttcaaa atcctccatc cataacacta ttaaaaccaa tcctatggct 

20581 gggctcagcc cactgcaacc tctgcctcct ggactcaagc cagcctccca ctttagcctc 

20641 ccgagtagcc agggctacag gcacacacca ccatgcccag ctaatttttg tattttttgt 

20701 agagactggg tctcactgtg ttgcccagac aggtcttgaa ctctgagctc aagtgatcca 

20761 tccaactcag cctcccaaag tgctaggatt acaggtgtga gtcaccatgc ctggcctctc 

20821 ctagtaaatt tttagaagtg gtgttgttag gtcaaaaggc aaacatgtat gtcatttttt 

20881 agagattttt aaatttcttt ccataagggt tgtaccagtt tgcatttcca tcacagtgta 

20941 tgagaatgcc tgtttcccca caaccttgcc aaaagaatgt cacagtttaa attttaccaa 

21001 tctgagaggt gagaaatagt acctgaaatt gtttaacgga catcttcaaa ttgaaattga 

21061 ggttgacaac gaatcatagt taggaccttt tttttttttt tttttgagtg ggtctcctcg 

21121 tcaccaagct gagtgcatgg cacgatttgc tcactgcaac ttccgccttc tgggttcaag 

21181 cgattctcct gcttcagcct cccaagcagc tgggactcca ggcgcgagtc accatgcccg 

21241 ctaatttttg tatttttagt agagacaggg ttttaccaga ttggccaggc tggtctcgaa 

21301 ctccttacct tgtgatcctc ccgcctcggc ctcccaaagt gctgagatta caggcatgag 

21361 ccaccacgcc tggcctaagg accattttta tataattttt tttttgagac agagtcttgc 

21421 tttgtcaccc aggctggagt gcaatggtgc aatcttggct cactgcagcc tccacttccc 

21481 tggttcaagt gattctcctg cctcagcctc ccgagtagct ggttccacag gtgcgtgcct 

21541 ggctagtatt tgtattatat aatttttttg tgaattgtct cttcatggtt ttttgcccat 

21601 tttttggtcc ctttcttatc aatttttgtg agttcttcgt atttatatta ggcctttatt 

21661 tgtgatatac attgcaaatg ttttctccta gtttgtcagt ttttttaacc tcatgtataa 

21721 tttttctggc catgcagttt aaaaaattac taggtagtca aatttatcaa tcattattta 

21781 taaatctggt ttgaacagag ataaactttc ctggccaagt gtggtgttta cacctgtaat 

21841 cccagcactc tgagaggctg aggtggggat cacctgaggt cagaagttca agaccagcct 

21901 ggccaacatg gtgaaaccct gtctctacta aaaatacaaa aattagctgg gcgtggtggc 

21961 tgatgcctgt agtcccagct actcaggaga ctgaggctgg agaattgctt gaacctggga 

22021 ggcggaggtt gcagtgagca gagatcgtgc cgctgcactc cagcctgggt gacagagcaa 

22081 gactctgtct caaaaacaaa acgacaaaaa acaacaacag aaaagccttt cctgatagct 

22141 aggtcattga ggaattcact catgttttct tctagtacct gatttcattt ttctgcactt 

22201 agattcctga ctcatatgga gtttattttt gtatctgatg tgaggcatag atctaattta 

22261 ttattttcca aatggctaac tagctgtctc taaacccttt attaaaaatt attggccaag 

22321 tgcggtagcc acacctgtaa tcccagcagt ttggaaggct gaggcaggat tgcttgaggc 

22381 caggaattca aaaccagccc agacaacata gcaagaccct gtctctacaa gaaaatattg 

22441 gtcaggtgtg gtggctcacg cctataatcc cagcactttg ggaggctgag gcaggtggat

26341 gaagggaaac gaggggatgc ctgtgaaggt gacagtgggg gaccctttgt catgaaggta
26401 agcttctcta aagcccaggg cctggtgaac acatcttctg ggggtgggga gaaactctag
26461 tatccagaaa cagttgcctg gcagaggaat actgatgtga ccttgaactt gactctattg
26521 gaaacctcat ctttcttctt cagagcccct ttaacaaccg ctggtatcaa atgggcatcg
26581 tctcatgggg tgaaggctgt gaccgggatg ggaaatatgg cttctacaca catgtgttcc
26641 gcccgaagaa gtggatacag aaggtcattg atcagtttgg agagtagggg gccactcata
26701 ttctgggctc ctggaaccaa tcccgtgaaa gaattatttt tgtgtttcta aaactatggt
26761 tcccaataaa agtgactctc agcgagcctc aatgctccca gtgctattca tgggcagctc
26821 tctgggctca ggaagagcca gtaatactac tggataaaga agacttaaga atccaccacc
26881 tggtgcacgc tggtagtccg agcactcggg aggctgaggt gggaggat



FIG. 28L 






LOCUS       HUMPMG3BA    3997 bp    mRNA            PRI       08-JAN-1995
DEFINITION  Human platelet membrane glycoprotein IIIa beta subunit mRNA,
            complete cds.
ACCESSION   M20311
NID         g190107
KEYWORDS    cell membrane glycoprotein; platelet membrane glycoprotein IIIa.
SOURCE      Homo sapiens cDNA to mRNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3997)
  AUTHORS   Zimrin,A.B., Eisman,R., Vilaire,G., Schwartz,E., Bennett,J.S. and
            Poncz,M.
  TITLE     Structure of platelet glycoprotein IIIa. A common subunit for two
            different membrane receptors
  JOURNAL   J. Clin. Invest. 81 (5), 1470-1475 (1988)
  MEDLINE   88213696
FEATURES             Location/Qualifiers
     source          1..3997
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /cell_type="erythroleukemia"
                     /map="17q21.32"
     sig_peptide     17..94
                     /gene="ITGB3"
                     /note="G00-120-013"
     CDS             17..2383
                     /gene="ITGB3"
                     /codon_start=1
                     /db_xref="GDB:G00-120-013"
                     /product="glycoprotein IIIa"
                     /db_xref="PID:g190108"
                     /translation="MRARPRPRPLWATVLALGALAGVGVGGPNICTTRGVSSCQQCLA
                     VSPMCAWCSDEALPLGSPRCDLKENLLKDNCAPESIEFPVSEARVLEDRPLSDKGSGD
                     SSQVTQVSPQRIALRLRPDDSKNFSIQVRQVEDYPVDIYYLMDLSYSMKDDLWSIQNL
                     GTKLATQMRKLTSNLRIGFGAFVDKPVSPYMYISPPEALENPCYDMKTTCLPMFGYKH
                     VLTLTDQVTRFNEEVKKQSVSRNRDAPEGGFDAIMQATVCDEKIGWRNDASHLLVFTT
                     DAKTHIALDGRLAGIVQPNDGQCHVGSDNHYSASTTMDYPSLGLMTEKLSQKNINLIF
                     AVTENVVNLYQNYSELIPGTTVGVLSMDSSNVLQLIVDAYGKIRSKVELEVRDLPEEL
                     SLSFNATCLNNEVIPGLKSCMGLKIGDTVSFSIEAKVRGCPQEKEKSFTIKPVGFKDS
                     LIVQVTFDCDCACQAQAEPNSHRCNNGNGTFECGVCRCGPGWLGSQCECSEEDYRPSQ
                     QDECSPREGQPVCSQRGECLCGQCVCHSSDFGKITGKYCECDDFSCVRYKGEMCSGHG
                     QCSCGDCLCDSDWTGYYCNCTTRTDTCMSSNGLLCSGRGKCECGSCVCIQPGSYGDTC
                     EKCPTCPDACTFKKECVECKKFDREPYMTENTCNRYCRDEIESVKELKDTGKDAVNCT
                     YKNEDDCVVRFQYYEDSSGKSILYVVEEPECPKGPDILVVLLSVMGAILLIGLAALLI
                     WKLLITIHDRKEFAKFEEERARAKWDTANNPLYKEATSTFTNITYRGT"
     gene            17..2383
                     /gene="ITGB3"
     mat_peptide     95..2380
                     /gene="ITGB3"
                     /note="G00-120-013"
                     /product="glycoprotein IIIa beta subunit"
BASE COUNT      917 a    993 c   1099 g    988 t
ORIGIN      Chromosome 17.

1 gcgggaggcg gacgagatgc gagcgcggcc gcggccccgg ccgctctggg cgactgtgct 
61 ggcgctgggg gcgctggcgg gcgttggcgt aggagggccc aacatctgta ccacgcgagg 
121 tgtgagctcc tgccagcagt gcctggctgt gagccccatg tgtgcctggt gctctgatga 
181 ggccctgcct ctgggctcac ctcgctgtga cctgaaggag aatctgctga aggataactg 
241 tgccccagaa tccatcgagt tcccagtgag tgaggcccga gtactagagg acaggcccct
301 cagcgacaag ggctctggag acagctccca ggtcactcaa gtcagtcccc agaggattgc 
361 actccggctc cggccagatg attcgaagaa tttctccatc caagtgcggc aggtggagga 
421 ttaccctgtg gacatctact acttgatgga cctgtcttac tccatgaagg atgatctgtg 
481 gagcatccag aacctgggta ccaagctggc cacccagatg cgaaagctca ccagtaacct 
541 gcggattggc ttcggggcat ttgtggacaa gcctgtgtca ccatacatgt atatctcccc 
601 accagaggcc ctcgaaaacc cctgctatga tatgaagacc acctgcttgc ccatgtttgg 
661 ctacaaacac gtgctgacgc taactgacca ggtgacccgc ttcaatgagg aagtgaagaa 
721 gcagagtgtg tcacggaacc gagatgcccc agagggtggc tttgatgcca tcatgcaggc 
781 tacagtctgt gatgaaaaga ttggctggag gaatgatgca tcccacttgc tggtgtttac 
841 cactgatgcc aagactcata tagcattgga cggaaggctg gcaggcattg tccagcctaa 
901 tgacgggcag tgtcatgttg gtagtgacaa tcattactct gcctccacta ccatggatta 
961 tccctctttg gggctgatga ctgagaagct atcccagaaa aacatcaatt tgatctttgc 
1021 agtgactgaa aatgtagtca atctctatca gaactatagt gagctcatcc cagggaccac 
1081 agttggggtt ctgtccatgg attccagcaa tgtcctccag ctcattgttg atgcttatgg 
1141 gaaaatccgt tctaaagtag agctggaagt gcgtgacctc cctgaagagt tgtctctatc 
1201 cttcaatgcc acctgcctca acaatgaggt catccctggc ctcaagtctt gtatgggact 
1261 caagattgga gacacggtga gcttcagcat tgaggccaag gtgcgaggct gtccccagga 
1321 gaaggagaag tcctttacca taaagcccgt gggcttcaag gacagcctga tcgtccaggt 
1381 cacctttgat tgtgactgtg cctgccaggc ccaagctgaa cctaatagcc atcgctgcaa 
1441 caatggcaat gggacctttg agtgtggggt atgccgttgt gggcctggct ggctgggatc 
1501 ccagtgtgag tgctcagagg aggactatcg cccttcccag caggacgaat gcagcccccg 
1561 ggagggtcag cccgtctgca gccagcgggg cgagtgcctc tgtggtcaat gtgtctgcca
1621 cagcagtgac tttggcaaga tcacgggcaa gtactgcgag tgtgacgact tctcctgtgt 
1681 ccgctacaag ggggagatgt gctcaggcca tggccagtgc agctgtgggg actgcctgtg 
1741 tgactccgac tggaccggct actactgcaa ctgtaccacg cgtactgaca cctgcatgtc 
1801 cagcaatggg ctgctgtgca gcggccgcgg caagtgtgaa tgtggcagct gtgtctgtat
1861 ccagccgggc tcctatgggg acacctgtga gaagtgcccc acctgcccag atgcctgcac 
1921 ctttaagaaa gaatgtgtgg agtgtaagaa gtttgaccgg gagccctaca tgaccgaaaa 
1981 tacctgcaac cgttactgcc gtgacgagat tgagtcagtg aaagagctta aggacactgg 
2041 caaggatgca gtgaattgta cctataagaa tgaggatgac tgtgtcgtca gattccagta 
2101 ctatgaagat tctagtggaa agtccatcct gtatgtggta gaagagccag agtgtcccaa 
2161 gggccctgac atcctggtgg tcctgctctc agtgatgggg gccattctgc tcattggcct 
2221 tgccgccctg ctcatctgga aactcctcat caccatccac gaccgaaaag aattcgctaa 
2281 atttgaggaa gaacgcgcca gagcaaaatg ggacacagcc aacaacccac tgtataaaga 
2341 ggccacgtct accttcacca atatcacgta ccggggcact taatgataag cagtcatcct 
2401 cagatcatta tcagcctgtg ccacgattgc aggagtccct gccatcatgt ttacagagga 
2461 cagtatttgt ggggagggat ttggggctca gagtggggta ggttgggaga atgtcagtat 
2521 gtggaagtgt gggtctgtgt gtgtgtatgt gggggtctgt gtgtttatgt gtgtgtgttg 
2581 tgtgtgggag tgtgtaattt aaaattgtga tgtgtcctga taagctgagc tccttagcct 
2641 ttgtcccaga atgcctcctg cagggattct tcctgcttag cttgagggtg actatggagc 
2701 tgagcaggtg ttcttcatta cctcagtgag aagccagctt tcctcatcag gccattgtcc 
2761 ctgaagagaa gggcagggct gaggcctctc attccagagg aagggacacc aagccttggc 
2821 tctaccctga gttcataaat ttatggttct caggcctgac tctcagcagc tatggtagga 
2881 actgctgggc ttggcagccc gggtcatctg tacctctgcc tcctttcccc tccctcaggc 
2941 cgaaggagga gtcagggaga gctgaactat tagagctgcc tgtgcctttt gccatcccct 
3001 caacccagct atggttctct cgcaagggaa gtccttgcaa gctaattctt tgacctgttg 
3061 ggagtgagga tgtctgggcc actcaggggt cattcatggc ctgggggatg taccagcatc 
3121 tcccagttca taatcacaac ccttcagatt tgccttattg gcagctctac tctggaggtt 
3181 tgtttagaag aagtgtgtca cccttaggcc agcaccatct ctttacctcc taattccaca 
3241 ccctcactgc tgtagacatt tgctatgagc tggggatgtc tctcatgacc aaatgctttt 
3301 cctcaaaggg agagagtgct attgtagagc cagaggtctg gccctatgct tccggcctcc 
3361 tgtccctcat ccatagcacc tccacatacc tggccctgag ccttggtgtg ctgtatccat
3421 ccatggggct gattgtattt accttctacc tcttggctgc cttgtgaagg aattattccc
3481 atgagttggc tgggaataag tgccaggatg gaatgatggg tcagttgtat cagcacgtgt
3541 ggcctgttct tctatgggtt ggacaacctc attttaactc agtctttaat ctgagaggcc
3601 acagtgcaat tttattttat ttttctcatg atgaggtttt cttaacttaa aagaacatgt
3661 atataaacat gcttgcatta tatttgtaaa tttatgtgta tggcaaagaa ggagagcata
3721 ggaaaccaca cagacttggg cagggtacag acactcccac ttggcatcat tcacagcaag
3781 tcactggcca gtggctggat ctgtgagggg ctctctcatg atagaaggct atggggatag
3841 atgtgtggac acattggacc tttcctgagg aagagggact gttcttttgt cccagaaaag
3901 cagtggctcc attggtgttg acatacatcc aacattaaaa gccaccccca aatgcccaag
3961 aaaaaaagaa agacttatca acatttgttc catgagg



FIG. 29C 






LOCUS       HUMATH3A3     238 bp    DNA             PRI       31-OCT-1994
DEFINITION  Human antithrombin III (ATIII) gene, exon 6.
ACCESSION   M21645
NID         g179149
KEYWORDS    antithrombin; antithrombin III
SEGMENT     3 of 3
SOURCE      Homo sapiens (individual_isolate Patient II-9) DNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 238)
  AUTHORS   Bock,S.C., Marrinan,J.A. and Radziejewska,E.
  TITLE     Antithrombin III Utah: proline-407 to leucine mutation in a highly
            conserved region near the inhibitor reactive site [published
            erratum appears in Biochemistry 1989 Apr 18;28(8):3628]
  JOURNAL   Biochemistry 27 (16), 6171-6178 (1988)
  MEDLINE   89050967
COMMENT     Draft entry and computer-readable sequence [1] kindly submitted by
            S.C.Bock, 20-JAN-1989.
FEATURES             Location/Qualifiers
     source          1..238
                     /organism="Homo sapiens"
                     /isolate="Patient II-9"
                     /db_xref="taxon:9606"
                     /cell_type="peripheral blood cell"
                     /map="1q23-q25.1"
     gene            join(M21643:1..398,M21644:1..469,1..183)
                     /gene="AT3"
     intron          <1..6
                     /gene="AT3"
                     /note="antithrombin III, intron F"
     CDS             <7..183
                     /gene="AT3"
                     /note="exon 6"
                     /codon_start=1
                     /db_xref="GDB:G00-119-024"
                     /product="antithrombin III"
                     /db_xref="PID:g179152"
                     /translation="VNEEGSEAAASTAVVIAGRSLNPNRVTFKANRPFLVFIREVPLN
                     TIIFMGRVANPCVK"
BASE COUNT       63 a     50 c     53 g     72 t
ORIGIN      About 7.8 kb from segment 3B; chromosome 1q23.

1 ctgcaggtaa atgaagaagg cagtgaagca gctgcaagta ccgctgttgt gattgctggc 
61 cgttcgctaa accccaacag ggtgactttc aaggccaaca ggcctttcct ggtttttata 
121 agagaagttc ctctgaacac tattatcttc atgggcagag cagccaaccc ttgtgttaag 
181 taaaatgttc ttattctttg cacctcttcc tatttttggt ttgtgaacag aagtaaaa

LOCUS       HUMGP2B2      623 bp    DNA             PRI       08-NOV-1994
DEFINITION  Human platelet glycoprotein IIb mRNA, C-terminal exon.
ACCESSION   M22569
NID         g183449
KEYWORDS    platelet glycoprotein IIb.
SEGMENT     2 of 2
SOURCE      Homo sapiens (tissue library: lambda-EMBL 4) DNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 623)
  AUTHORS   Prandini,M.H., Denarier,E., Frachet,P., Uzan,G. and Marguerie,G.
  TITLE     Isolation of the human platelet glycoprotein IIb gene and
            characterization of the 5' flanking region
  JOURNAL   Biochem. Biophys. Res. Commun. 156 (1), 595-601 (1988)
  MEDLINE   89025907
COMMENT     Draft entry and computer-readable sequence [1] kindly submitted by
            M.H.Prandini, 16-FEB-1989.
FEATURES             Location/Qualifiers
     source          1..623
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /cell_type="leucocyte"
                     /tissue_lib="lambda-EMBL 4"
                     /map="17q21.32"
     gene            join(M22568:1254..1869,1..434)
                     /gene="ITGA2B"
     intron          <1..191
                     /gene="ITGA2B"
                     /note="G00-120-012"
     exon            192..434
                     /partial
                     /gene="ITGA2B"
                     /note="last exon; G00-120-012"
     CDS             <192..251
                     /gene="ITGA2B"
                     /codon_start=1
                     /db_xref="GDB:G00-120-012"
                     /product="platelet glycoprotein IIb"
                     /db_xref="PID:g463108"
                     /translation="VGFFKRNRHTLEEDDEEGE"
BASE COUNT      144 a    158 c    181 g    140 t
ORIGIN      About 15 kb after segment 1.
1 aaaactcagg aagaaacaaa cccaccaatc gttccaggca tatctcaaat gcaaaaggca
61 tccattgtga gtacagtggg ctttcatgtt ctgcgctggt ccagggaggt gctcatagct
121 acttcctcac atgtgctctg gggccagcaa atcatctgta taccctgacc ttggcccccg
181 tgtaccccca ggtcggcttc ttcaagcgga accggcacac cctggaagaa gatgatgaag
241 agggggagtg atggtgcagc ctacactatt ctagcaggag ggttgggcgt gctacctgca
301 ccgccccttc tccaacaagt tgcctccaag ctttgggttg gagctgttcc attgggtcct
361 cttggtgtcg tttccctccc aacagagctg ggctaccccc cctcctgctg cctaataaag
421 agactgagcc ctgatgctga gcatgctgcc tccttttggg gccagagaag agagtaccga
481 agaatgtttt ggacggggac ctagggctgg tggaagtatg aacgagagag tcactgccag
541 ggcgaagttt gcaaatcact gtctttgggg agtgtcaggg agtacagagt tggggtggta
601 ggtgtaacag aagacggaga gcc

LOCUS       HUMCETP      1787 bp    mRNA            PRI       01-NOV-1994
DEFINITION  Human cholesteryl ester transfer protein mRNA, complete cds.
ACCESSION   M30185
NID         g180259
KEYWORDS    cholesteryl ester transfer protein; transfer protein.
SOURCE      Human adult liver, cDNA to mRNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1787)
  AUTHORS   Drayna,D., Jarnagin,A.S., McLean,J., Henzel,W., Kohr,W.,
            Fielding,C. and Lawn,R.
  TITLE     Cloning and sequencing of human cholesteryl ester transfer protein
            cDNA
  JOURNAL   Nature 327 (6123), 632-634 (1987)
  MEDLINE   87258172
FEATURES             Location/Qualifiers
     source          1..1787
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /dev_stage="adult"
                     /tissue_type="liver"
     mRNA            <1..1787
                     /note="CETP mRNA"
     sig_peptide     131..181
                     /gene="CETP"
                     /note="cholesteryl ester transfer protein signal peptide"
     gene            131..1612
                     /gene="CETP"
     CDS             131..1612
                     /gene="CETP"
                     /note="cholesteryl ester transfer protein precursor"
                     /codon_start=1
                     /db_xref="GDB:G00-119-773"
                     /db_xref="PID:g180260"
                     /translation="MLAATVLTLALLGNAHACSKGTSHEAGIVCRITKPALLVLNHET
                     AKVIQTAFQRASYPDITGEKAMMLLGQVKYGLHNIQISHLSIASSQVELVEAKSIDVS
                     IQNVSVVFKGTLKYGYTTAWWLGIDQSIDFEIDSAIDLQINTQLTCDSGRVRTDAPDC
                     YLSFHKLLLHLQGEREPGWIKQLFTNFISFTLKLVLKGQICKEINVISNIMADFVQTR
                     AASILSDGDIGVDISLTGDPVITASYLESHHKGHFIYKNVSEDLPLPTFSPTLLGDSR
                     MLYFWFSERVFHSLAKVAFQDGRLMLSLMGDEFKAVLETWGFNTNQEIFQEVVGGFPS
                     QAQVTVHCLKMPKISCQNKGVVVNSSVMVKFLFPRPDQQHSVAYTFEEDIVTTVQASY
                     SKKKLFLSLLDFQITPKTVSNLTESSSESIQSFLQSMITAVGIPEVMSRLEVVFTALM
                     NSKGVSLFDIINPEIITRDGFLLLQMDFGFPEHLLVDFLQSLS"
     mat_peptide     182..1609
                     /gene="CETP"
                     /note="cholesteryl ester transfer protein"
BASE COUNT      397 a    531 c    456 g    403 t
ORIGIN
1 gtgaatctct ggggccagga agaccctgct gcccggaaga gcctcatgtt ccgtgggggc
61 tgggcggaca tacatatacg ggctccaggc tgaacggctc gggccactta cacaccactg
121 cctgataacc atgctggctg ccacagtcct gaccctggcc ctgctgggca atgcccatgc
181 ctgctccaaa ggcacctcgc acgaggcagg catcgtgtgc cgcatcacca agcctgccct
241 cctggtgttg aaccacgaga ctgccaaggt gatccagacc gccttccagc gagccagcta
301 cccagatatc acgggcgaga aggccatgat gctccttggc caagtcaagt atgggttgca
361 caacatccag atcagccact tgtccatcgc cagcagccag gtggagctgg tggaagccaa
421 gtccattgat gtctccattc agaacgtgtc tgtggtcttc aaggggaccc tgaagtatgg
481 ctacaccact gcctggtggc tgggtattga tcagtccatt gacttcgaga tcgactctgc
541 cattgacctc cagatcaaca cacagctgac ctgtgactct ggtagagtgc ggaccgatgc
601 ccctgactgc tacctgtctt tccataagct gctcctgcat ctccaagggg agcgagagcc
661 tgggtggatc aagcagctgt tcacaaattt catctccttc accctgaagc tggtcctgaa
721 gggacagatc tgcaaagaga tcaacgtcat ctctaacatc atggccgatt ttgtccagac
781 aagggctgcc agcatccttt cagatggaga cattggggtg gacatttccc tgacaggtga
841 tcccgtcatc acagcctcct acctggagtc ccatcacaag ggtcatttca tctacaagaa
901 tgtctcagag gacctccccc tccccacctt ctcgcccaca ctgctggggg actcccgcat
961 gctgtacttc tggttctctg agcgagtctt ccactcgctg gccaaggtag ctttccagga
1021 tggccgcctc atgctcagcc tgatgggaga cgagttcaag gcagtgctgg agacctgggg
1081 cttcaacacc aaccaggaaa tcttccaaga ggttgtcggc ggcttcccca gccaggccca
1141 agtcaccgtc cactgcctca agatgcccaa gatctcctgc caaaacaagg gagtcgtggt
1201 caattcttca gtgatggtga aattcctctt tccacgccca gaccagcaac attctgtagc
1261 ttacacattt gaagaggata tcgtgactac cgtccaggcc tcctattcta agaaaaagct
1321 cttcttaagc ctcttggatt tccagattac accaaagact gtttccaact tgactgagag
1381 cagctccgag tccatccaga gcttcctgca gtcaatgatc accgctgtgg gcatccctga
1441 ggtcatgtct cggctcgagg tagtgtttac agccctcatg aacagcaaag gcgtgagcct
1501 cttcgacatc atcaaccctg agattatcac tcgagatggc ttcctgctgc tgcagatgga
1561 ctttggcttc cctgagcacc tgctggtgga tttcctccag agcttgagct agaagtctcc
1621 aaggaggtcg ggatggggct tgtagcagaa ggcaagcacc aggctcacag ctggaaccct
1681 ggtgtctcct ccagcgtggt ggaagttggg ttaggagtac ggagatggag attggctccc
1741 aactcctccc tatcctaaag gcccactggc attaaagtgc tgtatcc



FIG. 32B 






LOCUS 

DEFINITION 

ACCESSION 

NID 

KEYWORDS 
SEGMENT 
SOURCE 
ORGANISM 



REFERENCE 
AUTHORS 

TITLE 
JOURNAL 
MEDLINE 
FEATURES 

source 



prim_transcript 



intron 



exon 



intron 



exon 



intron 



exon 



intron 



exon 



intron 



exon 



intron 



intron 



HUMGPIIB2 13204 bp DNA PRI 10-NOV-1994 

Human platelet Glycoprotein lib (GPIIb) gene, exons 2-29. 
M33320 
gl83506 

platelet Glycoprotein lib. 
2 of 3 

Human leukocyte DNA. 
Homo sapiens 

Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata; 
Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
1 (bases 1 to 13204) 

Heidenreich,R. , Eisman,R., Surrey, S., Delgrosso, K. , Bennett, J. S. 
Schwart2,E. and Poncz,M. 

Organization of the gene for platelet glycoprotein lib 

Biochemistry 29 (5), 1232-1244 (1990) 

90212612 

Location/Qualifiers 
1. .13204 

/organism= "Homo sapiens" 
/db_xref = " taxon \ 9606 
/map= M 17q21.32" 
<1. .>13204 

/note= "GPIIb mRNA and introns" 
<1. .497 

/note= "GPIIb intron A" 
498. .619 
/gene="ITGA2B n 
/number =2 
620. .708 

/note= "GPIIb intron B° 
709. .806 
/ gene= " ITGA2B ■ 

/note= "platelet Glycoprotein lib" 
/number =3 
807.. 911 

/note= "GPIIb intron C" 
912. .1077 
/ gene= ■ ITGA2B" 

/note= n platelet Glycoprotein lib" 
/number =4 
1078. .1292 

/note= "GPIIb intron D" 
1293. .1342 
/gene="ITGA2B" 

/note="platelet Glycoprotein lib" 
/number =5 
1343. .1418 

/note=" GPIIb intron E (no splice consensus) ; putative; 
does not fit consensus" 
1419. .1464 
/gene="ITGA2B" 

/note=" platelet Glycoprotein lib" 
/number =6 
1465. .1551 

/note= "GPIIb intron F" 
1552. .1680 
/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 
/number =7 
1681. .2041 

/note="GPIIb intron G" 
2042.. 2089 
/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 
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/number =8 
intron 2090.. 2244 

/note="GPIIb intron H (no splice consensus); putative 

does not fit consensus" 
exon 2245.. 2288 

/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number =9 
intron 2289.. 2460 

/note="GPIIb intron I" 
exon 2461.. 2514 

/gene= ■ ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number=10 
intron 2515.. 2652 

/note="GPIIb intron J" 
exon 2653.. 2705. 

/ gene= B ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number=ll 
intron 2706.. 2896 

/note="GPIIb intron K" 
exon 2897.. 3108 

/gene="ITGA2B" 

/note="platelet Glycoprotein lib" 

/number =12 
intron 3109.. 5535 

/note="GPIIb intron L" 
exon 5536.. 5718 

/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number =13 
intron 5719.. 5951 

/note="GPIIb intron M" 
exon 5952.. 5997 

/gene="ITGA2B" 

/note= "platelet Glycoprotein lib* 

/number =14 
intron 5998.. 6105 

/note="GPIIb intron N" 
exon 6106.. 6210 

/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number =15 
intron 6211.. 6294 

/note="GPIIb intron 0" 
exon 6295.. 6350 

/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number= 16 
intron 6351.. 6442 

/note="GPIIb intron P" 
exon 6443.. 6594 

/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number =17 
intron 6595.. 6782 

/note="GPIIb intron Q" 
exon 6783.. 6908 

/gene="ITGA2B" 

/note= "platelet Glycoprotein lib" 

/number =18 
intron 6909.. 7885 

/note="GPIIb intron R" 
exon 7886.. 7953 
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=19
     intron          7954..8086
                     /note="GPIIb intron S"
     exon            8087..8234
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=20
     intron          8235..8802
                     /note="GPIIb intron T"
     exon            8803..8895
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=21
     intron          8896..9505
                     /note="GPIIb intron U"
     exon            9506..9585
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=22
     intron          9586..10201
                     /note="GPIIb intron V"
     exon            10202..10282
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=23
     intron          10283..10405
                     /note="GPIIb intron W"
     exon            10406..10505
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=24
     intron          10506..10604
                     /note="GPIIb intron X"
     exon            10605..10757
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=25
     intron          10758..10873
                     /note="GPIIb intron Y"
     exon            10874..10999
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=26
     intron          11000..11477
                     /note="GPIIb intron Z"
     exon            11478..11591
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=27
     intron          11592..11827
                     /note="GPIIb intron AA"
     exon            11828..11929
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=28
     intron          11930..12116
                     /note="GPIIb intron BB"
     exon            12117..12233
                     /gene="ITGA2B"
                     /note="platelet Glycoprotein IIb"
                     /number=29
     intron          12234..>13204
                     /note="GPIIb intron CC"
BASE COUNT 3046 a 3579 c 3857 g 2722 t
ORIGIN About 2000 bp after segment 1.
1 ctgcaggtca acggatctgc tagggtcctc ctatcagcac acacactcca gccccacttt
61 agaggtaccc gctaccttcc ctcattaaaa ccagctctca agaggggatc tggtaacagt
121 ctaggcaggc attccaggga gcatgtgaac cgctggttct tgttgcgggt ggaggatgga
181 ggtgttgtac agagtttagg tctttttcag caaagatctc caaaccccgg gtgttcaaaa
241 tcaaaccaaa ggggattata gtcccagctc tactcacaac tcactggtta ctttagccac
301 gagattgccc tcgctgagag tcggtttcac tgtccataag atgaagaagt acatcacggt
361 ggtctgtgag gtgtcattga ggaaagatgg tccagtgccc ccatgccaca tggccttcgg
421 gcagtgctcc cagcgccggc gccagggcct gggatacgct ggaatctgcg cggcgctcac
481 ccagctttcc tatgcagagt ggccatcgtg gtgggcgccc cgcggaccct gggccccagc
541 caggaggaga cgggcggcgt gttcctgtgc ccctggaggg ccgagggcgg ccagtgcccc
601 tcgctgctct ttgacctccg tgagtcccag gcaaggagag caaggttggg gtcagaggga
661 cgtggactgc ccgggcttca gcgccccacc ccttcttgtg ccttccaggt gatgagaccc
721 gaaatgtagg ctcccaaact ttacaaacct tcaaggcccg ccaaggactg ggggcgtcgg
781 tcgtcagctg gagcgacgtc attgtggtgg gccccgcggt acagggcaca gggaacaatc
841 gggggcaggg acactggggc caggaggagc ccaagtctcg cgccccgtcc ccatctgtgg
901 ccctttctca ggcctgcgcc ccctggcagc actggaacgt cctagaaaag actgaggagg
961 ctgagaagac gcccgtaggt agctgctttt tggctcagcc agagagcggc cgccgcgccg
1021 agtactcccc ctgtcgcggg aacaccctga gccgcattta cgtggaaaat gattttagta
1081 agcgccagct acgacctggc cccgcccact cgcgacggct tggccccgcc ccccatcgga
1141 tcccgccccc agcgccgcag cccttgcttt ggatctggcc tcgccccagg gccccgccga
1201 ctcaaggccc cgcccctgtc ccccagccct cctccgggct cgcgcgcgcc tcccttcacc
1261 cctgggctga cccctcctcc ttgtctcctc aggctgggac aagcgttact gtgaagcggg
1321 cttcagctcc gtggtcactc aggcgagtag ggagcaaaag cgcagtgggg gcggctccca
1381 aacagggccc cctctcaccc tcaggacttc ccttccaggc cggagagctg gtgcttgggg
1441 ctcctggcgg ctattatttc ttaggtacgt gcccatccgt acacctccct cccttctcgc
1501 ggccgaagga gaccgctttg ggcttcacac ccgctgtccc tcccgcccta ggtctcctgg
1561 cccaggctcc agttgcggat attttctcga gttaccgccc aggcatcctt ttgtggcacg
1621 tgtcctccca gagcctctcc tttgactcca gcaacccaga gtacttcgac ggctactggg
1681 gtaacaccgc cattccagac ttccagcacc ccgagggtca ccgcccaccg cagacggtca
1741 ggtcctgccc ctgtgggagc ctccatggcc acccctgccg gccaacccac cgcctaagcc
1801 gctcccgccc tccgctcctg cgcttccccg cagaccgccc acctcccatg cgcccaccgc
1861 tcccttccac tgcggactcg tagcgcagcc tggggcaggg cttggcccct cgaaggcctc
1921 cgtttttcca tctgcacaat gcagggctgg ggctgagtgg ccttaatctc ctccttcttt
1981 gccctccgtc ccctctgtgc ttcctcccct ggaaaagact aatttgcgcc cttgtcctca
2041 gggtactcgg tggccgtggg cgagttcgac ggggatctca acactacagg caagaaatcc
2101 acttagggcg ggagttgggt agcccagccc ggggaggagc gccttcctga aatctcccct
2161 atgtagctgg gtgcagaacg gggagcggga agtgggtagg ttctaaggct ctcattccct
2221 gagcctggct ctccctatcg ccagaatatg tcgtcgtgcc ccccacttgg agctggaccc
2281 tgggagcggt aagtgccccc accactgggc ctcccgaagc cccttatccc agttctcagg
2341 ctgacaactc ctgagcgccc cccacccccg ccccgcctcc accaaaccac cctttctcac
2401 ctggagtggg aggttgcttt gggtacaaga atgatgctct cgcctgcgct gtccgtgcag
2461 gtggaaattt tggattccta ctaccagagg ctgcatcggc tgcgcggaga gcaggtgggg
2521 gccaggtccc agtgggcgtg gctgggtgga gggggaactg agacttcaga atatttcatg
2581 ggaggtgagg gcccatttct taaagaggat gcttgtccag cggcgtgaat gatggtgctc
2641 ctcatcttgc agatggcgtc gtattttggg cattcagtgg ctgtcactga cgtcaacggg
2701 gatgggtgag gagggacatg cccccacccc tacccagttg ggtcccaaat taccagagct
2761 gcccctctgt ctccctttcc tagccctagt ctcacgtatc cactggagga acaggagagc
2821 aagggtcgag gagatttggc cctagcccca atatacccct ggtccagtcc catgtaacca
2881 ctcatctggc ccacaggagg catgatctgc tggtgggcgc tccactgtat atggagagcc
2941 gggcagaccg aaaactggcc gaagtggggc gtgtgtattt gttcctgcag ccgcgaggcc
3001 cccacgcgct gggtgccccc agcctcctgc tgactggcac acagctctat gggcgattcg
3061 gctctgccat cgcacccctg ggcgacctcg accgggatgg ctacaatggt gagggaagag
3121 aggagcccta cttgctgcag aggggttaac agccactcaa aaagcatgga gttggcctga
3181 gggcagccag aaccaggatg ggttttaagc atataagtat gtggcttaga cacatggggt
3241 gctgagtgga gagcagatgg gagagttgaa gactaattag gaagtgtttg ccttaatcca
3301 agcaagagac aatgaccacc tggatgtgga ttttggcagt ggagttagag atgggagtga
3361 cttcacagat atttaggact cggattatta ggacttggtg ggagactgga tgtggggcca
3421 ggggagaggt tggagttggg tgcctgtgat ggcctccact gcctggaact caggccgtgc
3481 agcaggtgct ggggagaggc gggagatcag cagttcagct ctggacctgt tgagcttgaa
3541 gggcttgggt gctttaggcg gaaatatcca aagaacagtt gggagtggct ctccccgctt
3601 ccacaagaga gatctgaatg ggagacaggg gtttggggaa agtggatgag gtcccgggac
3661 ctgtgaaata agaggcccag gatagagccc tagggagcaa aagcatttag gtgactccta
3721 caggaggtaa gtctgagaag gagacagagg agtgtccaga gagggaggag ggaacccagg

FIG. 33D 
3781 gggtctgatg gcccgggact caaggaagag catgcgttaa agagcatgca caggaggaag
3841 tgggcgctgc agctcctgct gctgctgcaa gatacaatta ggtggggctg gagaaatatt
3901 catgggcttt agcaagaaga gggtgccagg catggtggct catacctgta atcccagcta
3961 cttgggaaat tgaagcagga gaatctcttg aacccgggaa gtggaggttg cactgagctg
4021 agcttgcgcc actactgcac tccagcctgg gtgacagagc aagactccat ctcaacaaaa
4081 taaaaaaaaa aatagagaaa gaaaggaaga aagaaaaaag aaggggaggt tattggtgac
4141 agtgacataa attgattcag gccaagatag ggtcagaagc cagaatgcaa tggggtaagg
4201 tatgaatgga gatgaaaaat tggatgcagc taatgtagac agctctttca acaggtttgt
4261 ggtaaaaagg aatttgagga atagaaagga aaaaaaaaaa catgtttgac tataagagga
4321 aaaagagaaa aggtgatcac agaaaagaga tgagggtcaa gggaagatta tttcaatgtg
4381 gaagaacatg tagtaggttg aaaatgatgt tgtggggaaa tggggggatg agccagcaga
4441 gagtccctgt gatgcctcag ggggtgggag ggtgactggc ccagtgtcag ggtgaaggaa
4501 ggaaacctct tccagggtca aatggggaaa gggaaaaaga aagttggtgt gggattatag
4561 cataacagtg ggctgcctct cttcctgaag taagagatta cgtcacctgc tgaaggaagt
4621 gtggggggtc tgggagtttg atggaatgga gaaggctaga aatagatgct agatggccag
4681 gcacggtggc tcacacctgg aatcccagca ctttgggagg ccgaggcagg aggatcactg
4741 gagcctagga gtttgacacc agcctggcca acatagggag atctcgtctc cataaaaatt
4801 tttaaaaatt agctgggcat ggtggctata gtctcaactg cttgggaagc tgaggtggga
4861 ggattgcttt agtccagaag gttgaggctg cagtaagcca tggttgcacc actgcacttc
4921 agcctgaatg acaagtgcaa gactgtctta aaataaaaaa tttaaagggc ttgggcacgg
4981 tggctcacac ctgtaatcca gcactttggg agcccaaggt gggcagatca cttgaggtca
5041 ggagttcgag atcagcctgg ccaatgtggt gaaaccccgt ctctactgaa aatacaaaaa
5101 ttagccgggc atggtggtag gcgcctgtaa tcccagctac cgaagaggct gaggcacaag
5161 aatcacttta acgggggagg cagaggttgc agtgagccga gatcgcacca ctgcactcca
5221 gccaggacaa cagagcgaga ctccatctca aaaaaaaaaa aatttagaaa agggaataat
5281 gatgcttaat tttcaggata tattttcctc aatagacagt gagagttgtc actgttttta
5341 taacaatcct acttggcagg tccctctccc acctgattgt taactcctgg agggtagggc
5401 agtgcctcct tcacccacac tttgcacccc tttcctagtc tcctgggacg ttcccagaga
5461 agctcaggaa agttttacag tcatctaggg aggctgaata acaatcagcc acttcctttc
5521 tgttactcct tccagacatt gcagtggctg ccccctacgg gggtcccagt ggccggggcc
5581 aagtgctggt gttcctgggt cagagtgagg ggctgaggtc acgtccctcc caggtcctgg
5641 acagcccctt ccccacaggc tctgcctttg gcttctccct tcgaggtgcc gtagacatcg
5701 atgacaacgg atacccaggt gccctggact gcctccagct agaaatgccc aagaaaggcc
5761 cttggacatt cgctggaagt gccaagagac acggccaggg ctcatgcctg gcctggtgtc
5821 ccactatgga ctgccagagg ggctgggtga aacctccagt gggggaggtg gtgtggggaa
5881 cccctgggaa gatgagatga ggatccccat accctaatcg ccaattctga cccattcctc
5941 gatgtctata gacctgatcg tgggagctta cggggccaac caggtggctg tgtacaggtg
6001 agcactggct ccaggggcgg gatggggaag gtcctgtgcc atcaagagga ggccaggcca
6061 ggaggagcca caatggcaag cctccccatc accctatccc atcagagctc agccagtggt
6121 gaaggcctct gtccagctac tggtgcaaga ttcactgaat cctgctgtga agagctgtgt
6181 cctacctcag accaagacac ccgtgagctg gtgaggaggc agagggcatg ggccttaaag
6241 gatctgggac ctcagaaagg ctccaacccc tgagccccac ttacgtcttt gcagcttcaa
6301 catccagatg tgtgttggag ccactgggca caacattcct cagaagctat gtgagtggca
6361 tgaagggggc aggagggagg tgggcttgga ctcccccgga ggctggccag ggaggtcctg
6421 actcttctgc ttgccctgcc agccctaaat gccgagctgc agctggaccg gcagaagccc
6481 cgccagggcc ggcgggtgct gctgctgggc tctcaacagg caggcaccac cctgaacctg
6541 gatctgggcg gaaagcacag ccccatctgc cacaccacca tggccttcct tcgagtacgc
6601 ccaggcaggg gattggcagg gctgggagag tagaacttac ccactggact tgttcatcta
6661 gccctggggc actgagctgg gtgctgtgag tccgggggtg gtcaggacac aggtgcctac
6721 tggccaggag aaggtgggat gtgtatggta gcaagatggc ctgactcttg cccctgtcct
6781 aggatgaggc agacttccgg gacaagctga gccccattgt gctcagcctc aatgtgtccc
6841 taccgcccac ggaggctgga atggcccctg ctgtcgtgct gcatggagac acccatgtgc
6901 aggagcaggt agggacaggc agggacaggc cagggaggtg caggacccct gatagcaaat
6961 caggattagg gttagtgcca agtcacaatg taaccccaaa accttgatgt cattccaaac
7021 cctaatgaaa acctcaaaat ccagccagtc atggtggctc acacctgtaa tcccagcact
7081 ttgggagacc gaggcaggca gattgcctga ggtcaggagt tagagaccaa cctggccaac
7141 atggtgaaaa cccatctcta ctaaaaatac aaaaaaaatt agccgggtgt ggtgacgcat
7201 gcctgtaatt ccagctactc gggaggctga agcaggagaa tcacttgaac ccaggaggca
7261 gaggttgcag tgagccaaga gtgtgccaca gcactccagc ctgggtgaca gagcaagact
7321 ctgtctcaaa aaaaaaaaaa aaagccaggc gcagtggcct cacgcctgta atcccagcac
7381 tttgggaggc caaggcgggt ggatcacgag gtcaggagat caagaccatc ctggctaaca
7441 cagtgaaacc ccgtctacta aaaatacaaa aaaaaaaaaa aaattagctg ggcgtggtgg
7501 cgggtacctg tagtcccagc tacttgggag gctgaggcag gagaatggcg tgaaccccgg
7561 gggcggacgt tgcagtgagc cgagatagtg ccactgcacc ccagcctgga cgacagagcg
7621 agactccgtc tccaaaaata aaaaaacacc tgaaaatccc agtatcccct aagctctgat
gtaaattgac

atctacaaac 

aagcccactg 

cgcacacccc 

atgtgtgccc 

cctggccctt 

tgggcccaga 

tcctagttgg 

gggcctatga 

taagcaatgt 

ggagccttgg 

aaagatgtaa 

caaaactcta 

aacctctgtt 

cacataagta 

ctctaatgtt 

agcaatacac 

cctccctacc 

atgatgctct 

tgtaatcaga 

aagaagaacg 

gaataaccag 

tcctcacact 

tttaccagtg 

gcctgtaatc 

gaccagcctg 

catggtggcg 

aacctgggag 

acagagtgag 

gggtgaagac 

gaccttggcc 

ggaagaggct 

gcgtgcctac 

ccatacagaa 

tgcgaggcca 

tggtgaaacc 

aatcccagct 

aggttgcagt 

cgagattcta 

tcctcagaga 

agtgtggcat 

ccccagaaag 

gcaagaacag 

aggcccaagt 

ggctctaaac 

ctctgggggt 

aagaaggtga 
ctgggactgt

acctgctcta 

ctgtcaaccc 
tttgcccccc
ggtcctgggg

acgatgcttc 

aggagatggc 



aaaccctgac 
tccttttcgt 
ttttcctaac 
aatgtgcccc 
cagcttcagc 
tctgcctatc 
cccaggctcc 
ggcagataat 
agcagagctg 
cgaggtatgg 
ctctctcatc 
tttttttttt 
ttacaaaaac 
aacatttggt 
tatatttatt 
ttgtgttttt 
acactagcat 
ttggcacaca 
gtaatttctt 
agaaggagaa 
attttagggg
ccctttgcca 
ggcttccagg 
ccagcactct 
cacgcctata
gtggaggttg 
ggggagtctg

ccaccaccct 
gggtctttcg 
ccatctctac
tctcaaaaga
tttagcgagt 
gctctttgta 
tcactgggct 
ccagaatcca 
ggagctgcga
tccagggggc 
cttgggcacc 
gagggagcag 

ggggagcctc
agctccccac 
gaatggtctt 
catcctggat 
tctcaaggta 
gggcacctct 
caggtggact 
cgggatcgca
gttctcgtag
ctgaaagaca
ctgggactat
tcctgatggg 
attgtcccaa
cctcagatct 
cctgatgtaa 
tctagacacg 
tcactgccag 
atacctgctc 
ctggcttcac 
gtcctggagc 
gccgtgcacc 
cccccaccct 
tccctccctg 
aatttggagg 
cagaaaaaca 
ggatttcctt 
ttttatgttg 
tatttccaaa 
gtgacagtcc 
aatctttcca 
tcttggaact 
tgagaccagg 
gctgctgggt 
tgaggtttta 
acattgtcct 
taaaatagaa 
gggaggccag 
caaaaccccg 
gtcagagcta 
cagtgagccg 
ggaaaaaaaa 
attcagggca 
cccagatagg 
tgtccttcca 
tcccccgtct 
atacagcccc
agagctgggt 
gtgggctggg 

gggggctgcc

gacagatctt 
tgagcaggct 
gaggagggag 
agcacttcaa 
cttgcacttt
ttatcataga 
gtgagcaagc 
aaggccaccg 
gggggatgat 
tgcgactcgg
cgggccatgg

FIG. 33F 



acctccaaat 
tcttactccc 
tccctaaacc 
aatcgtcctg 
cgtgtgagga 
cacaccttag 
tcctctttcc 
tgcagatgga 
tgccccaggg 
gggaacagta 
agagtccctc 
aggatacttg 
aaaaaggttt 
ccagtctttt 
ttaatatagt 
atgaaaatgc 
cttgagcgac
gaccttccaa 
gccttcctga 
gtggtgctgt 
cgtggtaccg 
gagccacata
tgggtgagtg 
ataataatgg 
agcgggtgga 
tctctactaa 
ctcgggaggt
agatcatgcc
aaaaagaaaa 
gggctgtcct 
aatcgcgatg 
gctgcagata
gacccccgtg 
cagtggctca 
gtcaggagtt 
aattagctgg 
acgtgtcctc
tgagtgtggg 
cccagccaga 
tgaggtttga 
ccccaggtgt 
gtgctggtgg 
aaggtgaggt 
ctcacaggga 
tctatgtctc 
ccagcacttt 
ggccaacatg 
catgcctgta 
gcggaggttg 
cctccatctc 
aaacaaacac 
atagcacctg 
tagagggtat 
gatgcacttc 
tagaccctat 
gaggagtctc 
acccacaata 
ataagaaaaa 
gaggccgagg 
LOCUS       HUMHCF2      15849 bp    DNA
DEFINITION  Human heparin cofactor II (HCF2) gene
ACCESSION   M58600 J05309
NID         g183907
KEYWORDS    heparin cofactor II; serpin.
SOURCE      Human DNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 15849)
  AUTHORS   Herzog,R., Lutz,S., Blin,N., Marasa,J.C., Blinder,M.A. and
            Tollefsen,D.M.
  TITLE     Complete nucleotide sequence of the gene for human heparin
            cofactor II and mapping to chromosomal band 22q11
  JOURNAL   Biochemistry 30 (5), 1350-1357 (1991)
  MEDLINE   91120782
FEATURES             Location/Qualifiers
     source          1..15849
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /map="22q11.2"
     exon            1750..1796
                     /gene="HCF2"
                     /note="G00-120-038"
                     /number=1
                     /product="heparin cofactor II"
     gene            join(1750..1796,6948..7852,11623..11896,13654..13798,
                     14527..15372)
                     /gene="HCF2"
     mRNA            join(1750..1796,6948..7852,11623..11896,13654..13798,
                     14527..15372)
                     /gene="HCF2"
                     /note="G00-120-038"
                     /product="heparin cofactor II"
     exon            6948..7852
                     /gene="HCF2"
                     /note="G00-120-038"
                     /number=2
                     /product="heparin cofactor II"
     CDS             join(6964..7852,11623..11896,13654..13798,14527..14718)
                     /gene="HCF2"
                     /codon_start=1
                     /db_xref="GDB:G00-120-038"
                     /product="heparin cofactor II"
                     /db_xref="PID:g183908"
                     /translation="MKHSLNALLIFLIITSAWGGSKGPLDQLEKGGETAQSADPQWEQ
                     LNNKNLSMPLLPADFHKENTVTNDWIPEGEEDDDYLDL
                     PTDSDVSAGNILQLFHGKSRIQRLNILNAKFAFNLYRVLKDQVNTFDNIFIAPVGIST
                     AMGMISLGLKGETHEQVHSII^FKDFVNASSKYEITTIHNLFRKLTHRLFRRNFGYTL
                     RSVNDLYIQKQFPILLDFKTKVREYYFAEAQIADFSDP^
                     ALENIDPATQMMILNCIYFKGSWVNKFPVEMTHNHNFRLNEREVVKVSMMQTKGNFLA
                     ANDQELDCDILQLEYVGGISMLIVVPHKMSGMKTLEAQLTPRVVERWQKSMTNRTREV
                     LLPKFKLEKNYNLVESLKLMGIRMLFDKNGNMAGISDQRIAIDLFKHQGTITVNEEGT
                     QATTVTTVGFMPLSTQVRFTVDRPFLFLIYEHRTSCLLFMGRVANPSRS"
     exon            11623..11896
                     /gene="HCF2"
                     /note="G00-120-038"
                     /number=3
                     /product="heparin cofactor II"
     exon            13654..13798
                     /gene="HCF2"
                     /note="G00-120-038"
                     /number=4
                     /product="heparin cofactor II"
     exon            14527..15372
                     /gene="HCF2"
                     /note="G00-120-038"
                     /number=5
                     /product="heparin cofactor II"
BASE COUNT 4477 a 3814 c 3642 g 3916 t
ORIGIN
1 gggctttgca tgtgtgagaa caagacagag aatgagggag gtgggcccca cgaggagtgt
61 gggcacagac agcagcctct gcctgtggtg ccacgctgaa gactcagtat tgtatgtgac
121 agatgaaggc tctaagaaga cagctctgac aaaagctaga gtgcaaaatc agactcagac
181 «*~ CCaCCg gtctgtgtcc tgaacacaat ggacctttac actctggaat ttctcaaacg
241 ESS? cagacacccc catgggcccc ttgcacaccc gcagattctc ctaggagtca
301 cattctctct tcagatagac tctgggtgcc gacactccca aacatgctct tgaggagcag
361 tctctgtgat aagctgatct tccagacaat ccagaatatt cttaaaactt tttagatcat
421 aaaatttaaa acacaaatta aaaaacaaat tatcataagg ccgggcacag tgactcatgc
481 ctgtaatccc agcactttgc aaggctgaag caggaggatc acttgagccc aagagttcaa
541 gaccagccta ggcaacatag tgagaccctg tctctacaaa aaagtcaaaa gttagctaga
601 catggtggtg tgcacctgta ttcccagcta cttgcagggc tgaggtgagg aggattgctt
661 cagctcggga ggttgaggct gcagtgagcc aagatcacgc cactgcactc cagcctgggt
721 aacagagtga gaccctgtct caaaaaacac atagggccag gcgtggtggc tcacgcatgt
781 ™?« agCa ctttgggagg ccgagacggg aggatcactt cactccagga gttcaacacc
841 ^^ Ca ac atagtgaa accccgtctc tactaaaaat acaaaaaatt agttggacat
901 ggtggtgtgc gcctgtaatc tcagccactc aggaggctga ggcaggagaa cgcttgaact
961 tgggagacag aggttgcagt gagctgagat cgcaccactg cactccagca tgggcagcag
1021 cgcgaaactc tgtctcaaaa caaacaaaca aacaaacaaa cacccataaa cacaaaatgt
1081 atcacagcct cagagatccc cacgaatgcc taagtggccc tgaatttggg aggcactgct
1141 cagtaatagt cctatctgtc ccacaacaga caggagtgct gggctgcacc tactggcaac
1201 a ^f a ^ a acccttgact gaagaaaggt ccatgccaca atccccttat tctgtaagcc
1261 actaattttg tcctctctcc tccacctttc actgaggaac gagctcttgg aaggacaggg
1321 ^acccgcct agtagctgag ccagccacat cagtcctgga gagcaggtgg agggcagatg
1381 ctgtgatcat cccagaagag aggacacagt tggaggcaga tgcatggtct ctactttcag
1441 ctaccctcaa tgcagcctgg tccccagagg cctgaagagc gccttgttta tgtggtgacc
1501 tcaagagggg ctgctcctgc accaaggcta tgtgtgcatg ctaacacagt aaccgtcata
1561 tactcaaagt gtcagctcta agaactggag atgaggagct gcaagccact ctacagttat
1621 caaaggcaca gctgaggggg tttgtgctgl cclag?tggt tgcctggtgt ttggattggg
1681 acttatttac tttggaaaat atgcagcaac agcccagcac caaagttcac atcaaaatcc
1741 at?™?^ f ttcatctctg aagcgccact tctcagaaac acagaggtaa
1801 ?55?? g 55? ta atgtttct gctgattata aattattttt ggtgtttacg gataggcaac
1861 tggttcattt ttctagcaaa ctaagaattc agaagctttc tacactgttt tagaagtggg
1921 aaatggtttc atttttcagt gtgcctatta taaaattgtg tcagttccat tgttgggaga
1981 ?55 gaCaaa ° ttagaata gg agctgtggaa tagatgaaaa tattgtactt atattaaatt
2041 aatcgaattg gataactgtc ctgtgattat gtatgagaat atccttgctc ttgggtattt
2101 tccctgaagt attagtatta aaggttagag gggccgggtg cagtggctca cgcctgtaat
2161 ^ Caacac 5 t tgggaggccg aggcgggtgg atcacglgg? caggagttca agaccagcct
2221 gaccaacatg gtgaagccaa gtctctacta aaaatacaaa aattagctgg gcgtggtggc
2281 acgcgcctgt aatcccagct actcaggagg ctgaggcagg agaatcgctt aaacccggga
2341 ggcagaggtt gcagtgagcg gagatcgtgc cactSIactS cagcctggac aacagagtta
2401 gactcc ^ tca aaaaaaaaaa aaaaaagaag aaaaaagaaa aaatgttaga ggaacaagat
2461 ataggagacc tactctcaaa tggtctagaa gaaaaaatgt gtatgtgcat gcctgtgaga
2521 acacacacgt acgtacacac acacacagat aatgacaggg caaaggttcc aaaattttaa
2581 a "5 ggtaaa tctcggtacg ggtatacagg agttgttcta ctacactatt ctttcaacat
2641 ttttggaagt ttgaacttac ttcaaaataa aaag?tttcc aaactttagg cagttacttc
2701 tctcccattc tgcctgctct gttgggcctg gagaccatac accaggaggg atgacggttt
2761 a^ aagtgtt ? tgctctgat gcgtgactgl aaaggccaac ccagctctgg caattagcaa
2821 gaaagcacaa tatgaagttc ccaggaaaaa aaaaaagcaa aacaaacttt tgaatgattt
llll taattt^caa aca^?^ ctcttcaaac agtaatctgg atttaatcac aacctagtga
llll taccttcaqt ^ttaarr* Ca " gtttgt tatactaaat agcaaaacat caggaaga^t 
ca ^ cccca 9 a tctttaattt caatccataa aagatatcag acratattttc tccttcctet 
llll f.allT.T tgacgaaaac tatttttggc tttttatcaj afaatgtggg aacagjgtat 
llll caltc?atct taaTr-^ " Ctgaatac cgggataaaa catgcatgfc t t tal?ctgc 
llll tttgttcttt alcctta?^ a t^^ ttCCt gaatgcttat ttattcaagt tggtttttgt 
3301 ctt^toactt t^h^. ttatctgaga agaaaacatt ttcccccttt gttccttctt 
jjox cttttggctt tcttttttaa aatagagatg aggtcttgct atgttgctcc agctacrtctt 
IAI gaactcctgg gctcaagcga tcctcctgcc ttggcctlcc aagatgctaa SttlSggt 
3421 gtgagcccct atgcctggtc ttcttcttct tgatcttagc caLaggcca agaagtgala 
^gaggac a «tgaagtg tagttgggca aggagccttc taccagctgc SSEtttcS 
3601 taaaaaof a o ttttaaaagt gtgttgctat tgatacacag tctcctgaL tgtaaaatgc 
llll aagctaagtt actcaaagtg ccattcagaa actgggccca gttctatttg 

3721 ZlarlZrt? ^attagaaat "tttctaga ggctgagcat ggtalctcat acctgtaat? 
3721 ccagcacttt gggaggccaa ggcaggagaa ttgcctgagc tcaggagttt gagacctgtc 
HJ; ^ gggcaaca 5 ggtaaaaccc catctttacc aaaaacacla aaafltlact gggtttggtg 
llal ^^ aCaCCt g 5 ggCccca 9 ctacttcaaa aggctgaggt gggagggtct c??gaglltg 
llll aaS™ f BBcagtga accaatattg tgccactgca c?cclg?ctg ggtgacagag 
4o" Sotaaac ™^ ataaaaataa aaagaaatcg tttctlgaaa ligltttccl 
4081 lllltlln^ ^ agtgg = act gcagcctgag gcaggtgctg agatggggac ctggaaaagg 
till a aa ct«ctt la fS aaa <=aatgtg actttcctgc tccaaaatgt gclattcafa 
4201 agttgtgact aaaacaaact ttgaacttac tatttcaaca gtattataag 

till lltln^n aaggaat ^9 actggcactg ggaaaacagc taggaagctg ctctgcacgg 
AH ^ a ?? 9agtC cg 9aagcatc ctggtactcc agagcgaaca aggctgagcg cttgatgtgg 
^aMlaaaaaac? ' ^ C " ggttc 9 aa tctagccact gSLcttatt agtgacfgS 
4441 gtacc a ?ata atatataaaa tgttgggagg atgaaactaa gttacacgai 

4501 Sacatttc 3 SL'f " catccaa « agaggccatt atcaacatta accacactga 
4561 f?££!?£f! agcagagtat ccgaacagtt accccatctt caggcctact gagttcaaat 
4561 atttgcttaa caagagcagc cagtaactct tacctggcct caactggcag cagatattct 
till glaatttttt StSttSt a ' aggaaatg gtcaclgaca caaaafalgc ttlacaaaag 
4741 ?cclaattaa S^S^" " ttgttttc tgttttttga gataaggact cactctatcl 
4801 tlcttcllll i£*?°? 9 tggcgt 9 at c acggctcact gcagactcaa gtgatcctcc 
till ttttcttftc tttttt?^ 9 a * gggaccac aggcgtgtgc catcacacca ggctaattat 
4551 tttttttttt ttttgagacg gagtttcgct ctttttgccc aggctggagt 

4981 tmtnrtf, gatc " ggct caccacaacc tctgcctcct gaattciaac gStctSctg 
till otatt?ttra ^:r atCt gggattaca 3 gcatgcgcca ccacgccggc taattttttt 
5101 tagagacag 9 gtttctccat gttggccagg ctggtctcga actcccgacc 

Sill Ictllllctt tftat^aat gg ?" cccaa a 9tgctggga ttactgaclt gagccalcgc 
5221 tcttaSr ZtJSZ I ttttcacaga gatgaggtct tgctatgttg cccacactgg 
5281 acatlaa^f ™22^ gtgatcttcc tgccttggtc tcccagtgtt gggattatag 
5341 ?a?ta a cto a ofS"" c ? ggcagtCc "tctggggt gattagaagt tgggaccatl 
54oJ atcatcaafn eSSE?! 00 attataaaca cctatggtca ctgtcctggc aaaacatggl 
5461 ™^ aaa ? ctcatctaac cagagtgcag ttaataacca ggaagtaagc aagagaaaga 
552^ 99<=agtcaaa acagatttga caggccaagt cagaEcctcc tclgiacglg 

558? tta? a ? ga ^ aaataaagac a ggattgcca taatgcctct gtgctaaaag cttltcttg? 
5fi4i "acttaaat aaagggagtg cccctcaggt cttgagtaag agcttgctga catcaccctc 
llOl ttactcctaa ^ tCtC " gt "ctaaccct gtgttlgaag clgtafcala gaagatttag 
llll tacaa-aS nt 9 ^ 399 agctatt 9 tc taagagatac aaaggagaaa aaagtatacc 
5851 nfn? 9 .* 9 gatatcacct ctggggctgc caccacatca cctcactacg ccctgagggg 
llll atoS taga «aagtt ccaaatcttt tgcaaattaa acaaccccag gtcaggc??g 
llll cSaat a ta aa S ca 9= a ^ttg gggggctgag gtgggtgga? ?acc?gagg? 
6001 ^ a ? 9a ?^= 9 agacca gcct ggccaacaga gcaaaacccc atctctacta aacaaaatac 
foil 11 caggcgtagt ggtgtgcacc tgtagtccca gctacttggg aggctgaggc 

6U1 c ~ ' ="gagtcca ggaggccgaa gttgcagtaa Iccgaga??g cglcactg?! 
6181 VLltlllll 9 9gtgacaga 9 tgagactcca tttcaaaaaa taaaaacaac aaaagccaat 
6241 a 3 ^?^ acaacaaa aa aacaacgaat taaacaaccc caaagattgc acaaitttca 
6241 agtatcttta gaatatgttt tcagaaagcc tggcccatgg acatttttca acagcatctc 
6361 t?S ? tggaatggt gtgagtcaca caggcatggl tgagtcccac taafgcacat 
6421 aaa ™^ C ? M tcaccagccc caggtgccca ctcaagccca gctc?tagtg 

K4fti aggtt 5 ccct gactctctgg gcacttccac tcctaccaca cagggtagag ccacacccct 
65^ CtC ^ ggcagcatt attttgagag ccltSgctt? actgcacgtc 

6eSi tlaalactat ? ^ggtccatga gcccctggtg ggaactttgt ctctggtLc 

6661 t^cS ^ 99a "^? 9 tggacaaggt gtctggagaa aaacaaactc ctccctggga 
lno\ tgcctgagct cccaggattc tagaaggtta gttttgcaaa cctttaaaga agggattttc 
6721 atcaaggggc ccacagatcc ttcattgagg tttatgagtc ccacatclaa gSIJggtgt 
6781 ctatctacat cagattctct taaagtccat gatcctaaaa cagttaagaa ctaatgctgt
6841 gagggcctct tcctgggtca aagccacagg gaacctgcca tgtggatgct gcagcggggt 
6901 gtggatcagc caggccgcct ttcactgtgt tctgttttcc c?crcag2tt Lgctccgcc 
7021 ™? aaaC actcattaaa cgcacttctc attttcctca tcataacatc tgcgtggggt 
708? ? C CC9 " gga ^gctagag aaaggagggg aaactgctca gtctgclgat 

7l« ™2? Wg agca9tcaaa taacaaaaac ctgagcatgc ctcttctccc tgccgac?tc 
7141 cacaaggaaa acaccgtcac caacgactgg attccagagg gggaggagga cgacgactat 
7261 agaagatatt "gtgaagac gacgactacl tlgacatlgt cgacfgtctg 

7321 f a I' cgacagactc tgatgtgagt gctgggaaca tcctccagct tSttcatggc 
7381 ^ a0 «I? tccagcgtct taacatcctc aacgccaagt tcgctttcaa cctctaccga 
7381 gtgctgaaag accaggtcaa cactttcgat aacatcttca tagcacccgt tggcatttct 
l\il att?ta^? 9 ?^ atgatt ^ "taggtctg aagggagaga cccatgaala a^cactcg 
7561 aaLr^ "aaagactt tgttaatgcc agcagcaagc atgaaatcac gaccattcaE 
n\%l 11 tc l ctt l c gtaagctgac tcatcgcctc ttcaggagga attttgggta cacactgcgg 
7621 tcagtcaatg acctttatat ccagaagcag tttccaatcc tgcttglctt caaaacEaal 
llll ?caala ga ^ a ^ taC " tg ^ tgaggcccag atagctgact tltcagaccc tgccttcata 
7801 a ^ * acaaccacat catgaagctc accaagggcc tcataaaaga tgctctggag 
7961 c ^ gc 5 accca gatgatgatt ctcaactgca tctacttcaa aggtaaglgl 

7861 cacctttaca gttctcacag caaacccaca acatactatt tttgtatgtg ggtagattga 
7981 toaaa™ f ^actgta gctataattt atccaggaaa actagacfcl lgat?gac?c 
loli c? 9aa t g ? gg ^ g ? gaa " ccaa 9 ct 9aa gtgacagtag catctgacac tSactgagcc 
8041 ctaactctgt gctttaacac agccttgtga ggtcatcact gttattagca tccccatttt 
Sill ataatccca^ atgaagtaaa aggatgggct gggcgcggtg gctclcgcct 

8221 ?^^^ ag cacttt g99 a ggccgaggca ggcagatcac ttgaggtcag gagttcgaga 
828^ ae™™ caacagac = a ^"tggtgaa aacctggctc tactaaaaat IcaaaaatL 
8281 gctgggcctg gcggtgggtg cctgtactcc cagctacttg ggaggctgag gcaggagaat 
SJS taoacaarao tggaaggcag Wtgcagt gagccgagal tgtglcaltg Sltfgcc 
8401 tggacgacag agtgagactc catctcaaaa aaaaaaaaaa aagaagtaaa acgatgctcc 
852^ aa ? g9C ^ CC agttattaa9 gggcagagcc aaagctgaac ccagggaggc calccctagc 
llll laaatlttr? ataatacaaa aactgtttta gcat^ggcc agcctgga?t 

8641 o 93 ?^ cttttccttt cccaattatc aataagcagg aatatagaca aaaggctaaa 
8701 * gtgaactat tcagcttgag cagctgacat tgacacctac aag?gctttt 

8761 aa 9 ?^ 3 ^ tt9aactact gggcaggtgg gatggagaaa taaattacta ttEclccagc 
8821 aa aoft»«» g ggctga9Cac aagggcactt tttaaggagg tcaccccaca cccatcacac 
8881 a t a t a ^ a99a C ^ t99aat cctaggaata aataagcatg gatttgtaaa atccaaacct 
till ttllttn^t atatcctcac ctggaccaga ccagaagaaa cctctacttt actctctaag 
8941 ctgagagtgt ggaaggggaa acacgaggaa tggttcggct tcaggactaa ttgcggtgac 
BOS! ~ C "ctctttgc caccaaggac taccaggtac ctglaaaggg cagtlctlgg 
9?21 tnt, " tttctgctag ttagctcccg tggttttata gcagcccagg cgaaggaagl 
Sill a? ^T 99 cttct9ttca gggaaagggg gccagagc?c c?cc?gatc? 

9241 llZtaatr™ £ gctct 3 t B "ttggctga ggcccctgca gctctacaag gcaggcattc 
9301 lltlrtlin?. g ' caagcagg gtcactctga cacccaggtt tccaccccaa ggcatggcac 
llll aa ™ g9 tcct 9tgggt ggaatcaaag gctgagttct aacaggcttg cggcagacac 
9361 acacacagag accacatgta catgatgaac acacatatcc ttttcattac aggttattag 
till E£2£™ ggaattgagc aaacaagagt ctaagcgctg gtttcaccac t??tcgtttg 
ISS SSSSSS ^ aa ? tCat tcaacatctc tatgactcag tttccttatc tttatfcacaj 
Hoi « ga ^ a ^^ cactct S aca gggccgaggg aagaaccata agcgatggca atgcaacaga 
9661 ?™ a » a 9 acaagagctc agcgaatttg agggaatgaa actgtagitt acaatactag 
I5S a 3 ^ 3 ^ 93 taaacatat 3 atattgttag tgacatttat tttacttcta ctagcaaati 
Till tcct™^ a99aCtga f "agaacagg ctggcagaag catttttggc agcatcaaag 
till t tactggtctg ttggagcccc ccaagtacac caaagagcct ctgcattagc 

9901 tZlZnllall ff^ 039999 caggca9a9a a gtacagcag tgagccatcc ctjcctgclt 
9901 ggaggtggag aaatgatcag gcatggtcag ttgacaatct cctaaacaca gtaacccgtg 

lSSi aatcactc^a acgtgcaaat flcttctgctt cctttcccca Latgagaa? 

10081 a?ae a r^ a »f^ 99gCat cacaa "9 at caaatgctag gagtacccaa tcattcatgg 
10141 aaggg9ac9a gtgtctagaa gtgtaatttt aatttcactt aatttcatat 

lo20i 9 f aa ==! ccattactaa ttttgttcta attttaatgt gataatcact ttgtaaagca 
lolll f C 9 aggcag9Ctc tcatgaggaa gtcagaagga aagaatccca agagacalgg 

" gacagctcca tccaaactga aagggccgtg attcccaaaa gagcaatttt gtccccaagq 
iSSi aaa^™ ct " tgg " g tcacaacctg gggggttgga gtlagcatta Itggtatc?! 
° gaagggggag gctggggatg ttgctaaaca ccctaccatg cacaoggcag cccacattac 

lliol caaactcaaa ^aLT" aaatgtCaaa aa tgctgag^ ttgJSK ctgggtgagg 

inlfii ^ agact f a 9g gagaagggaa tcgagcttca ctcacaggca ggcaggagct gtctggtact 

Iwll ccttaaacaa SEES* 06 tgctcatctc atcctggctg cLtlcccac cagc?lgaaa 

10621 ccttgaacaa gttacttcac ttctttgtgc ctctgtttcc tcatatgtaa aagagggata 



FIG. 34D 



SUBSTITUTE SHEET (RULE 26) 



WO 99/50454 



PCT/US99/06473 



88/97 



10681 acaaaacgca cacaacttgc atgttgctag gagcagaaat gagataatac aggaaaggtg
10741 ctgagaagaa tgcccggcac atggccagtt ctcaactact agtcacccat tactattagt
10801 tactcacatc ttagagctaa catagacatg ggcttattcc tggatacaca gcactgtccc
10861 catatctaca gtggtgatcc taagggcaac atggcatcac ccaaatgtct tgttagtcac
10921 tacagaatca cagtgtgagg gatgaaggcc atcaagacag agctgaggct ggcagggtgg
10981 ctcatgccta taatcccagt gctttggaag gctgaggcag gaggattgct tgaggccaag
11041 ggtttgagac cagcctaggt aacatagcaa gaccccatct acaattaaaa aaaaaaaaaa
11101 aaagacagaa agaaaaaata gccaggcgtg gcatgtgctt gtagtccaag ctactgggga
11161 gggaggctga ggcaggagga ttccttgagc ctgggagtgt gaggctgcag tgagctatga
11221 tggcatcgcc gcactccagc ctgcatgaca cagtgagacc tggtctcaaa aaccaaataa
11281 taataacagt aataaaagct ggaaagagct caaagttact catttgacag atgtgacaga
11341 tgaagaaata gaagcgagtt aggtgcctta ccatggtcaa acaactagtt cgtatcagac
11401 cctactccag aaactattcc agtccgggta acctctcgtt aacctctctt gttagaaatg
11461 caaatttctg cccaaatcag gcctcaggaa tcaagagact gtggggtcgg ctctgcaggc
11521 tatctgaatg aggcctccag ggaaatcaga ttcactctca agggtgagac gatttcccta
11581 aaggaacctt ctcataacag cctcttcctg tggcctttac aggatcctgg gtgaataaat
11641 tcccagtgga aatgacacac aaccacaact tccggctgaa tgagagagag gtagttaagg
11701 tttccatgat gcagaccaag gggaacttcc tcgcagcaaa tgaccaggag ctggactgcg
11761 acatcctcca gctggaatac gtggggggca tcagcatgct aattgtggtc ccacacaaga
11821 tgtctgggat gaagaccctc gaagcgcaac tgacaccccg ggtggtggag agatggcaaa
11881 aaagcatgac aaacaggtat ttcacactgt gtgtttgttc ttttgagctc ccagatgctg
11941 ggggtgtctg ggaatactgg aaaatggatc atttttttaa aaagggagaa ttatgtacaa
12001 gtacccaaga acttccatac agggccactc tgttaattca gccccaattt gttgcttgag
12061 ataagagatg attagagagc attcataagg gacacatctg ccctctaggg gccagtttca
12121 gaagttagag gcagatgact tagagacagc ttggtgcttg ctttgtggct tcgagtccca
12181 gcttcatcat ccctaaaatg ggtataattc cattacttcc ccgggtcact tgagaaaata
12241 acagaatcag cgatgctgag cgcccctccc agtacttgga acctaggagg cactcaaaaa
12301 aagattggct caactcttcc ctgcccagga aattccaagg tcctcttagc ctaccgagga
12361 cacatcattc atgatttcct ctattattat tcgttacttt gtagttaaaa ctgcaggtgt
12421 taagtactta ttgagattat tattgggtca tggcagaaag aatggagagg tcttatttct
12481 gtcttactgg atactggcta ggcccatatg aagaagtgat tctggtttga acctccttat
12541 aggacaagaa tacaaacata tgcaaccaaa ctgagaaaag taggctctca gaggaaggta
12601 tttgcccggg tagccagtca tcatgctctg tgaatttttc cttaacaacg tcccttctgt
12661 acctgcctcc ttccattcct ccctgcagcc cggcagctct tgagaaaggg actgcatctt
12721 tttttttttt ttttttttga gacagggtct tgttctgtca cccaggctgg agtgcagtgg
12781 catcatcatg gctcactgca gcctcaacct cctgaactta agtgatcctc tcacctcagc
12841 ctcctgaata gttgagacta caggcgtgca ccttcatgcc cagctaatta aacttttttt
12901 ggtagagatg aggtctcgct gtgttgccca ggctggtctt gaactcctgg cctcaagcag
12961 tcctcctgcc ttggccttcc aaagtgctgg gattaacagg cgtgagccgc tgtgcctggc
13021 ccatttgact tttaattgag atcttacttg gtgcaaggta tgagctaggt aaaagagtga
13081 agaagatcaa gccttcctgc ccatccagct gggattgcac cttaaatctc tttatcccct
13141 gcaaagtgcc agactaactc cacaggcact actgttgcta tccgccccct tagggattga
13201 gtaagttgag gcaaagattg agatattcag cattgtctag tatatacagg aaaggttctt
13261 tttaaaagta cactaccaga tattcgactc cttaattaca aaaaaaaaac caaatgccta
13321 aaattgggaa accaaaccag agaattattt tagatgcctt tttaaaccat aaaccaggaa
13381 aagttctgct gctaaccttg aagataggaa acgaaccata cagtctcaag gaaataatca
13441 tgcaacagaa aacacacctc agttttcagt agcggaatta caaaggagtg tgcttcctaa
13501 aatcctcaac tgacagtccc ggaatataaa ttttaataag tgctatatca attctgtgat
13561 aaatataacc cgtggccctt taaagggaaa atcatgattc ttttgtaact tgtggttcaa
13621 taaaactggg cccccctttc cttttctgtc tagaactcga gaagtgcttc tgccgaaatt
13681 caagctggag aagaactaca atctagtgga gtccctgaag ttgatgggga tcaggatgct
13741 gtttgacaaa aatggcaaca tggcaggcat ctcagaccaa aggatcgcca tcgacctggt
13801 aaccactccc ttgtccaccc ccgacccgtc cccagggtct gcctcagcac agccccacct
13861 ccacttgccc ttcctaccca ccccccaatc tcatgtccca gcttggggtg ctgagtctgc
13921 tcttcggcct gggtgggata cacagaatgc ctagtttcat ggatgccagc tggagagcac
13981 ggcacctggc agacacttac tgggcagggg ggatcccaag agcagccatg gggtgagccc
14041 cactcccgct gacaccagag acaggggaga catgtgctgc ggtctgggaa atagctaccc
14101 ccagccaaat catgaaagag ccattaaaca ccgcactata caacatactt aacttaaacc
14161 aatcgggtcg ctcagcaaaa gagagagaac accagtccaa acagtgcagc agacccagtt
14221 ccccatcccg gagaagtgcg cagcagtgtg gggagctgga gctggggtgg ctgtcctgca
14281 ccagccccca cgaccctcag accacaggca ctgccaagag ggaacatgaa cctagccggc
14341 ctctaagtgc aacggctgcc cctgacaggt ggtgacagat attttcaaga gtgactctga
14401 ccagctgtga tttccacctt acatgttgtc tttggatcct ttccctgaat gatatgagat
14461 tgtgctggga actctagccc tctgtgtgct gacctccaga atctgacaac tttcctttcc
14521 aaacagttca agcaccaagg cacgatcaca gtgaacgagg aaggcaccca agccaccact
14581 gtgaccacgg tggggttcat gccgctgtcc acccaagtcc gcttcactgt cgaccgcccc
14641 tttcttttcc tcatctacga gcatcgcacc agctgcctgc tcttcatggg aagagtggcc
14701 aaccccagca ggtcctagag gtggaggtct aggtgtctga agtgccttgg gggcaccctc
14761 attttgtttc cattccaaca acgagaacag agatgttctg gcatcatcta cgtagtttac
14821 gctaccaatc tgaattcgag gcccatatga gaggagctta gaaacgacca agaagagagg
14881 cttgttggaa tcaattctgc acaatagccc atgctgtaag ctcatagaag tcactgtaac
14941 tgtagtgtgt ctgctgttac ctagagggtc tcacctcccc actcttcaca gcaaacctga
15001 gcagcgcgtc ctaagcacct cccgctccgg tgaccccatc cttgcacacc tgactctgtc
15061 actcaagcct ttctccacca ggcccctcat ctgaatacca agcacagaaa tgagtggtgt
15121 gactaattcc ttacctctcc caaggagggt acacaactag caccattctt gatgtccagg
15181 gaagaagcca cctcaagaca tatgaggggt gccctgggct aatgttaggg cttaattttc
15241 tcaaagcctg acctttcaaa tccatgatga atgccatcag tccctcctgc tgttgcctcc
15301 ctgtgacctg gaggacagtg tgtgccatgt ctcccatact agagataaat aaatgtagcc
15361 acatxtactg tgtatctgtt ataattctct attttttgaa gctcaaatat caaaagccaa
15421 atccaaattc ctggataact ccaggtatga taaaggctga gaggaagtca cttgagcacc
15481 acaatgtgcc acagcagggc atgttctcag gacaggacag gtgtgtgctg aatcctgggg
15541 agggtctgtg cagtacccca gaactgtggg gtgctaagtg gcacacaagc cccagggctc
15601 ccacagtcta tgccaggctg ctgcagcttt catccctcat acctggtcct gcagtgggtc
15661 tggcttgaca gagcagatga cacctgagga atatgtttct ggatccttca atccctgggt
15721 aagacaagtg aaatccacag aggctgttca gcacgcaaga gtgccagtgc tctttcagtg
15781 aggggatgac tgacggtcac aggtgctgtg tgtgcaggtg tctaactgta accccacagc
15841 ctggcagat
LOCUS       HUMTHRR      3472 bp    mRNA            PRI       10-OCT-1991
DEFINITION  Human thrombin receptor mRNA, complete cds.
ACCESSION   M62424
NID         g339676
KEYWORDS    thrombin receptor.
SOURCE      Human DNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3472)
  AUTHORS   Vu,T.H., Hung,D.T., Wheaton,V.I. and Coughlin,S.R.
  TITLE     Molecular cloning of a functional thrombin receptor reveals a
            novel proteolytic mechanism of receptor activation
  JOURNAL   Cell 64, 1057-1068 (1991)
  MEDLINE   91168254
FEATURES             Location/Qualifiers
     source          1..3472
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     CDS             225..1502
                     /codon_start=1
                     /product="thrombin receptor"
                     /db_xref="PID:g339677"
                     /translation="MGPRRLLLVAACFSLCGPLLSARTRARRPESKATNATLDPRSFL
                     LRNPNDKYEPFWEDEEKNESGLTEYRLVSINKSSPLQKQLPAFISEDASGYLTSSWLT
                     LFVPSVYTGVFVVSLPLNIMAIVVFILKMKVKKPAVVYMLHLATADVLFVSVLPFKIS
                     YYFSGSDWQFGSELCRFVTAAFYCNMYASILLMTVISIDRFLAVVYPMQSLSWRTLGR
                     ASFTCLAIWALAIAGVVPLVLKEQTIQVPGLNITTCHDVLNETLLEGYYAYYFSAFSA
                     VFFFVPLIISTVCYVSIIRCLSSSAVANRSKKSRALFLSAAVFCIFIICFGPTNVLLI
                     AHYSFLSHTSTTEAAYFAYLLCVCVSSISSCIDPLIYYYASSECQRYVYSILCCKESS
                     DPSSYNSSGQLMASKMDTCSSNLNNSIYKKLLT"
BASE COUNT      933 a    817 c    785 g    937 t
ORIGIN

1 gcgcccgcgc gaccgcgcgc cccagtcccg ccccgccccg ctaaccgccc cagacacagc 
61 gctcgccgag ggtcgcttgg accctgatct tacccgtggg caccctgcgc tctgcctgcc 
f 2S*"** ccg gc * ccccgac ccgcagaagt caggagagag ggtgaagcgg agcagcccga 
5Ii f^ g f ? cctcccggag cagcgccgcg cagagcccgg gacaatgggg ccgcggcggc 
irn 5? CtgCtggt ggccgcctgc ttcagtctgt gcggcccgct gttgtctgcc cgcacccggg 
301 cccgcaggcc agaatcaaaa gcaacaaatg ccaccttaga tccccggtca tttcttctca 
361 ggaaccccaa tgataaatat gaaccatttt gggaggatga ggagaaaaat gaaagtgggt 
421 taactgaata cagattagtc tccatcaata aaagcagtcc tcttcaaaaa caacttcctg 
ill n!^ a ^° agaagat 9<=c tccggatatt tgaccagctc ctggctgaca ctctttgtcc 
Im ^ tg 5 gt f cacc 9gagtg tttgtagtca gcctcccact aaacatcatg gccatcgttg 
601 tgttcatcct gaaaatgaag gtcaagaagc cggcggtggt gtacatgctg cacctggcca 
661 cggcagatgt gctgtttgtg tctgtgctcc cctttaagat cagctaEtal ttttccggca 
781 Jtttgggtct gaattgtgtc gcttcgtcac tgcagcattt tactgtaaca 

III ^^ acg ^ ctc tftcttgctc atgacagtca taagcattga ccggtttctg gctgtggtgt 
lol gtccc ^ ctcc tggcgtactc tgggaagggc ttcSttcact Egtctggcca 

Izl ^ tgggctt * SSrccatcgca ggggtagtgc ctctcgtcct caaggagcaa accatccagg 
1021 caaca ^ cact acctgtcatg atgtgctcaa tgaaaccctg ctcgaaggct 

loll S ^ a ^°5 C ! gccttct ^tg ctgtcttctt ttttgtgccg ctgatcattt 
HA «^™^ 9 ttatgtg 5 ct atcattcgat gtcttagctc ttccgcagtt gccaaccgca 
1901 ?™~ gtC ccgg ^tttg ttcctgtcag ctgctgtttt ctgcatcttc atcatttgct 
1261 Itntnn^n ^°^ Ctc ctgatt ^gc attactcatt cctttctcac acttccacca 
1261 cagaggctgc ctactttgcc tacctcctct gtgtctgtgt cagcagcata agctcgtgca 
1321 tcgaccccct aatttactat tacgcttcct ctgagtlda gajgticgtc tlcag?a?ct 
1381 tatgctgcaa agaaagttcc gatcccagca gttataacag cagtgggcag ttgatggcaa
1441 gtaaaatgga tacctgctct agtaacctga ataacagcat atacaaaaag ctgttaactt
1501 aggaaaaggg actgctggga ggttaaaaag aaaagtttat aaaagtgaat aacctgagga
1561 ttctattagt ccccacccaa actttattga ttcacctcct aaaacaacag atgtacgact
1621 tgcatacctg ctttttatgg gagctgtcaa gcatgtattt ttgtcaatta ccagaaagat
1681 aacaggacga gatgacggtg ttattccaag ggaatattgc caatgctaca gtaataaatg
1741 aatgtcactt ctggatatag ctaggtgaca tatacatact tacatgtgtg tatatgtaga
1801 tgtatgcaca cacatatatt atttgcagtg cagtatagaa taggcacttt aaaacactct
1861 ttccccgcac cccagcaatt atgaaaataa tctctgattc cctgatttaa catgcaaagt
1921 ctaggttggt agagtttagc cctgaacatt tcatggtgtt catcaacagt gagagactcc
1981 atagtttggg cttgtaccac ttttgcaaat aagtgtattt tgaaattgtt tgacggcaag
2041 gtttaagtta ttaagaggta agacttagta ctatctgtgc gtagaagttc tagtgttttc
2101 aattttaaac atatccaagt ttgaattcct aaaattatgg aaacagatga aaagcctctg
2161 ttttgatatg ggtagtattt tttacatttt acacactgta cacataagcc aaaactgagc
2221 ataagtcctc tagtgaatgt aggctggctt tcagagtagg ctattcctga gagctgcatg
2281 tgtccgcccc cgatggagga ctccaggcag cagacacatg ccagggccat gtcagacaca
2341 gattggccag aaaccttcct gctgagcctc acagcagtga gactggggcc actacatttg
2401 ctccatcctc ctgggattgg ctgtgaactg atcatgttta tgagaaactg gcaaagcaga
2461 atgtgatatc ctaggaggta atgaccatga aagacttctc tacccatctt aaaaacaacg
2521 aaagaaggca tggacttctg gatgcccatc cactgggtgt aaacacatct agtagttgtt
2581 ctgaaatgtc agttctgata tggaagcacc cattatgcgc tgtggccact ccaataggtg
2641 ctgagtgtac agagtggaat aagacagaga cctgccctca agagcaaagt agatcatgca
2701 tagagtgtga tgtatgtgta ataaatatgt ttcacacaaa caaggcctgt cagctaaaga
2761 agtttgaaca tttgggttac tatttcttgt ggttataact taatgaaaac aatgcagtac
2821 aggacatata ttttttaaaa taagtctgat ttaattgggc actatttatt tacaaatgtt
2881 ttgctcaata gattgctcaa atcaggtttt cttttaagaa tcaatcatgt cagtctgctt
2941 agaaataaca gaagaaaata gaattgacat tgaaatctag gaaaattatt ctataatttc
3001 catttactta agacttaatg agactttaaa agcatttttt aacctcctaa gtatcaagta
3061 tagaaaatct tcatggaatt cacaaagtaa tttggaaatt aggttgaaac atatctctta
3121 tcttacgaaa aaatggtagc attttaaaca aaatagaaag ttgcaaggca aatgtttatt
3181 taaaagagca ggccaggcgc ggtggctcac gcctgtaatc ccagcacttt gggaggctga
3241 ggcgggtgga tcacgaggtc aggagatcga gaccatcctg gctaacacgg tgaaacccgt
3301 ctctactaaa aatgcaaaaa aaattagccg ggcgtggtgg caggcacctg tagtcccagc
3361 tactcgggag gctgaggcag gagactggcg tgaacccagg aggcggacct tgtagtgagc
3421 cgagatcgcg ccactgtgct ccagcctggg caacagagca agactccatc tc
LOCUS       HUMLPLFI     3877 bp    DNA             PRI       07-JAN-1995
DEFINITION  H. sapiens lipoprotein lipase (LPL) gene, exons 7, 8, and 9, and an
            Alu repetative element.
ACCESSION   M76722 M76723
NID         g187215
KEYWORDS    Alu repeat; lipoprotein lipase; plasma protein.
SOURCE      Homo sapiens blood DNA.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3877)
  AUTHORS   Chuat,J.C., Raisonnier,A., Etienne,J. and Galibert,F.
  TITLE     The lipoprotein lipase-encoding human gene: sequence from intron-6
            to intron-9 and presence in intron-7 of a 40-million-year-old Alu
            sequence
  JOURNAL   Gene 110 (2), 257-261 (1992)
  MEDLINE   92165069
FEATURES             Location/Qualifiers
     source          1..3877
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /cell_type="lymphocyte"
                     /tissue_type="blood"
                     /map="8p22"
     intron          1..198
                     /partial
                     /gene="LPL"
                     /note="G00-120-700"
                     /number=6
     CDS             join(199..319,1840..2022,3052..3156)
                     /partial
                     /gene="LPL"
                     /codon_start=3
                     /db_xref="GDB:G00-120-700"
                     /product="lipoprotein lipase"
                     /db_xref="PID:g553523"
                     /translation="FHYQVKIHFSGTESETHTNQAFEISLYGTVAESENIPFTLPEVS
                     TNKTYSFLIYTEVDIGELLMLKLKWKSDSYFSWSDWWSSPGFAIQKIRVKAGETQKKV
                     IFCSREKVSHLQKGKAPAVFVKCHDKSLNKKSG"
     exon            199..319
                     /gene="LPL"
                     /note="G00-120-700"
                     /number=7
     gene            join(199..319,1840..2022,3052..3156)
                     /gene="LPL"
     intron          320..1839
                     /gene="LPL"
                     /note="G00-120-700"
                     /number=7
     repeat_region   complement(746..1027)
                     /gene="LPL"
                     /note="G00-120-700"
                     /rpt_family="Alu repeat"
     exon            1840..2022
                     /gene="LPL"
                     /note="G00-120-700"
                     /number=8
     intron          2023..3051
                     /gene="LPL"
                     /note="G00-120-700"
                     /number=8
     exon            3052..3156

FIG. 36A 
                     /gene="LPL"
                     /note="stop codon (tga) is interrupted by intron 9,
                     between tg and a; G00-120-700"
                     /number=9
     intron          3157..3877
                     /partial
                     /gene="LPL"
                     /note="G00-120-700"
                     /number=9
BASE COUNT     1145 a    787 c    746 g   1199 t

eJ acStaa ttcraro^ " a ^catga acactgtgca tgatgaagtc tttccaagcc 

i?i l» g ttccat 9 t 9t gtgcacttcc ggtttgagtg ctagtgagat acttctgtqg 

ill Itt^ l 9 cct ^ actatt t93ggttg t g atattttcat aaagaEtgat caacatgttc 

III nttl cccaacagtc ttccattacc aagtaaagat tcattttEct gggactgaga 

HI Sc a tccc attcactctg" Stable tttCtCtgta tggcaccgtg SSSSS 
361 ctcctaccat L„r^n„? tgagtagcac aggggggcgg tcatcatggc accagtccct 
421 cttlagttaa aaaaar™ 9 - "gagcagca gaagcagaga gcgatgccta gaaaacaagt 
481 aaaaafor? 3 93 atttcaaaa t tgaggtcttt cctctatttg atattgagaa 

til ?c a t a tltct gtttat??^ a "" atttt cacttactag ttatattttt ttatt?a?ca 
601 aaaagottto ™? " tttataaa 9C tgctgttaaa caacataatc aaaocatctc 
661 atatfftato tlcaaa^If 33atgagcaa tggtaacagg aaaccactcc atagatgtac 
721 tt a tt a ttt? t a t a ? a tttt ^^f 39 ^ agaa 9 tccat gacaaagtgt tagctctttt 
781 n™^,;;; Z fcttt tttfct gagat ggagtctctc tctattgccc aggctggagt 

11} ?" 9tgat ^ C 9 at ctcagct cactgcaacc tctacctccc gagttcaaac aattctlctg 
901 tattt?S f C9a9ta9Ct ^ggctgcag gtgcccacca ccatgcccag ctaatttttg 
961 cagqtaatcc ''^T tctcaccatg ttggccaagc tggtcttgal ttcctgatc? 
1021 ccSctac cctt?act* 9 gcc * cccaaa ^tgctgggat tacaggtgtg agccaccatg 
1081 aattlctaal taL^»f ^aatcaaaga aataaaagta aggcaacttg atacttttac 
1141 tgaacaaatc tttaaaaata gccagtgcag acaaggtggt gaagcagaac 

1201 S? 33 ^ accat 9 catc attcacggct agaaccctcc aggtgcggia ggtlgtattt 
llll tttatcctaa ITatttal? aaaatattat tacatagaag ggagfga?tt llttltlltt 
1321 actaacltaa 2fc£SS!£ C aacaaacatt tttaaaaaca tcaattacag tcgtacctat 
1381 ^tatraaa 33 =^f 9 f C ° ca 9 tatc <=aa cattgaggca gtgggtaaat gaatcgtggt 
1441 aacataa?aa ^f^ 3 atcta ^ cctt taaaaactat aattgtagga aaccclggaa 
1501 tgcta a aa aa EE ? 933 5 ataaaatct 9 aagagaataa agaatagaga atcgtalgtg 
1561 alatgalaat FEEEE' aat 9 ttcaa 5 tatcaacaca aattgaaaag gaatacafgf 
1621 taSatat L 9 aat 9 att 5 a c ttcaggattt tcttttagaa ttgtattala 

1681 cSac^r o^?? 9 ^ aat 9 ct 3gaa tgtggatata atttaaaata tactaaatgc 
1741 ISataaa E^E^T ^ cttt ^^ acatttttgt gcatttttaa aatatcccct 
1801 aaa h 33 3 octatttata tttggagagg agaaaaaaaa gtggggggca gggagagctg 
1*61 gacctactcc ttcctEE^ ttattgcttt tttgtttagg cc?gflgi!tt clacfaftaa 
1921 laaatoaaa^ *E« 33tt 5 acaca 9 a gst agatattgga gaactactca tgttgaagct 
1981 cattcfSaa £?EEEE acttta 9 ct 9 Stcagactgg tggagcagtc ccggcttcgc 
2041 cttcctfca? tttaE™ aagcagga 9 a gactcagaaa aagtaattaa atgtatttEt 
2101 tcacaattcl EEEEE ? acct 9 at 9t caggacctag gggctgtatt tcaggggcct 
2l61 tttaggagtc ttrttSOS £te?EEE * gtatttatt actgtatgat gtagattttc 
2221 ttatatttE rE=*™ ttcttatttt tggggggcgg ggggggaagt gacagtattt 
22B1 EEE tgtaaggaaa acataagccc tgaatcgctc acagttattc agtgagagct 
llll Igftaaaaaa f^,^ tca 9«tctc atttggcact gttLttgta agtLIalat 
2401 ccagafaaE "^ctccg agatgctacc tggataatca aagattcaaa ccaacctctt 
2461 cca?Sn « a ? 3 CC33 ^taatctca acctgtctcc gcagccccac ccatgtgtac 
2521 EEEE 9 aattacaca 9 agatcgctat aggatttaaa gcttttatac taaatgtgct 
llii llltitttlt EEEEf ^tgctgttat tgttaattta Laaaactct aagttSgat 
2641 alalaaaaaa EEE ? g * cattt 9 ct tgtatcacca aagaagcaaa caaacaaaca 
2701 caafatctaa ? a333agatC " g ^ afc 9g aaatgttata aagaatcttt tttacactag 
2761 attoltct 3 ? ?=? a 99 f 3g at 9 ccctaa t tccttaatgc agatgctaag agatggcagi 
2ali o«? 3 ^^ I tatcatctct tggtgaaagc ccagtaacat aagactgctc taggc£gtct 
28M ? ? ' ^ at f t3aat taacta g«tt ggttgctgaa caccaggtta gg«ctfaaa 
2 941 9 attct g afc gt ggcctgagtg tgacagttaa ctattgggaa tatcaaaaca 

2001 £S aStctlc 3tt3t " aaa cagtcctgac agaactgfac ctttgtgaac 
inKi 3 ?^ Z 9 att g ttcta c atggcatatt cacatccatt ttcttccaca gggtgatctt 
lltl atgccatalc Iaat aa ^ 9t Ctcattt 9 ca gaaaggaaag gcacctgcgg ?I?tfgtgaa 
3181 ctggaca? a c ^afan^ 93 ataa 3 aa 9 tc aggctggtga gcattctggg ctaaalc?ga 
jibi ctgggcatcc tgagcttgca ccctaaggga ggcagcttca tgcattcctc ttcaccccat 
3241 caccagcagc ttgccctgac tcatgtgatc aaagcattca atcagtcttt cttagtcctt
3301 ctgcatatgt atcaaatggg tctgttgctt tatgcaatac ctcctctttt tttctttctc
3361 ctcttgtttc tcccagcccg gaccttcaac ccaggcacac attttaggtt ttattttact
3421 ccttgaacta cccctgaatc ttcacttctc cttttttctc tactgcgtct ctgctgactt
3481 tgcagatgcc atctgcagag catgtaacac aagtttagta gttgccgttc tggctgtggg
3541 tgcagctctt cccaggatgt attcagggaa gtaaaaagat ctcactgcat cacctgcagc
3601 cacatagttc ttgattctcc aagtgccagc atactccggg acacacagcc aacagggctg
3661 ccccaagcac ccattctcaa aaccctcaaa gctgccaagc aaacagaatg agagttatag
3721 gaaactgttc tctcttctat ctccaaacaa ctctgtgcct ctttcctacc tgacctttag
3781 ggctaatcca tgtggcagct gttagctgca tctttccaga gcgtcagtac tgagaggaca
3841 ctaagcatgt gaccttcact actcctgttc tgaattc
LOCUS       HSU59436      182 bp    DNA             PRI       19-JUN-1996
DEFINITION  Human low-density lipoprotein receptor (ldlr) gene, exon 12,
            partial cds.
ACCESSION   U59436
NID         g1381233
KEYWORDS    .
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 182)
  AUTHORS   Sibul,H. and Metspalu,A.
  TITLE     A new polymorphism in exon 12 of the human low-density lipoprotein
            receptor (LDLR) gene
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 182)
  AUTHORS   Sibul,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-MAY-1996) Hiljar Sibul, Estonian Biocentre,
            Biotechnology, Riia 23, Tartu, Estonia, 2400
FEATURES             Location/Qualifiers
     source          1..182
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     intron          <1..25
                     /gene="ldlr"
                     /number=11
     primer_bind     1..21
                     /gene="ldlr"
     gene            1..182
                     /gene="ldlr"
     exon            26..165
                     /gene="ldlr"
                     /number=12
                     /product="low-density lipoprotein receptor"
     CDS             <26..>165
                     /gene="ldlr"
                     /note="LDLR"
                     /codon_start=3
                     /product="low-density lipoprotein receptor"
                     /db_xref="PID:g1381234"
                     /translation="LLSGRLYWVDSKLHSISSIDVNGGNRKTILEDEKRLAHPFSLAV
                     FE"
     variation       replace(45,"t")
                     /gene="ldlr"
                     /frequency="0.17"
     primer_bind     complement(163..182)
                     /gene="ldlr"
     intron          166..>182
                     /gene="ldlr"
                     /number=12
BASE COUNT       36 a     53 c     44 g     49 t
ORIGIN
1 tctccttatc cacttgtgtg tctagatctc ctcagtggcc gcctctactg ggttgactcc
61 aaacttcact ccatctcaag catcgatgtc aatgggggca accggaagac catcttggag
121 gatgaaaaga ggctggccca ccccttctcc ttggccgtct ttgaggtgtg gcttacgtac
181 ga
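
As a quick consistency check on the HSU59436 record, the following sketch (an illustration only, not part of the original figure) recounts the bases of the ORIGIN sequence, applies the `replace(45,"t")` variation, and translates the partial CDS (`<26..>165`, `/codon_start=3`) before and after the change. The helper names `apply_replace` and `translate` are hypothetical, and the codon table is assumed to be the standard genetic code.

```python
from collections import Counter
from itertools import product

# ORIGIN sequence of HSU59436 as listed in the record (182 bases).
ORIGIN = (
    "tctccttatc cacttgtgtg tctagatctc ctcagtggcc gcctctactg ggttgactcc "
    "aaacttcact ccatctcaag catcgatgtc aatgggggca accggaagac catcttggag "
    "gatgaaaaga ggctggccca ccccttctcc ttggccgtct ttgaggtgtg gcttacgtac "
    "ga"
).replace(" ", "")

# Standard genetic code, codons enumerated in t, c, a, g order (assumption).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_replace(seq: str, pos: int, base: str) -> str:
    """Return seq with the base at 1-based position pos replaced by base."""
    return seq[: pos - 1] + base + seq[pos:]


def translate(seq: str, start: int, end: int, codon_start: int = 1) -> str:
    """Translate the 1-based inclusive span start..end.

    The first codon_start - 1 bases of the span are skipped, and any
    trailing partial codon is ignored.
    """
    first = start - 1 + (codon_start - 1)
    coding = seq[first:end]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


base_count = Counter(ORIGIN)
variant = apply_replace(ORIGIN, 45, "t")
ref_protein = translate(ORIGIN, 26, 165, codon_start=3)
var_protein = translate(variant, 26, 165, codon_start=3)

print(dict(base_count))
print(ORIGIN[44], "->", variant[44])
print(ref_protein == var_protein)
```

Under these assumptions the counts agree with the BASE COUNT line (36 a, 53 c, 44 g, 49 t), and the c-to-t change at position 45 falls in the third position of a leucine codon (ctc to ctt), so the translated peptide is unchanged.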
LOCUS       HSCLA1GNA    2566 bp    RNA             PRI       06-OCT-1993
DEFINITION  H. sapiens encoding CLA-1 mRNA
ACCESSION   Z22555
NID         g397606
KEYWORDS    CLA-1.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Calvo,D. and Vega,M.A.
  TITLE     Identification, primary structure, and distribution of CLA-1, a
            novel member of the CD36/LIMPII gene family
  JOURNAL   J. Biol. Chem. 268 (25), 18929-18935 (1993)
  MEDLINE   93366811
REFERENCE   2  (bases 1 to 2566)
  AUTHORS   Vega,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (15-APR-1993) VEGA M., HOSPITAL DE LA PRINCESA, UNIDAD DE
            BIOLOGIA MOLECULAR, C/ DIEGO DE LEON 62, MADRID, MADRID, SPAIN,
FEATURES             Location/Qualifiers
     source          1..2566
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /cell_type="promyelocytes"
                     /cell_line="HL60"
                     /clone_lib="HL60 cDNA library, Angel L. Corbi"
     5'UTR           1..69
     CDS             70..1599
                     /codon_start=1
                     /product="CLA-1"
                     /db_xref="PID:g397607"

                     /translation="MGCSAKARWAAGALGVAGLLCAVLGAVMIVMVPSLIKQQVLKNV

RIDPSSLSFNMWKEIPIPFYLSVYFFDVMNPSEILKGEKPQVRERGPYVYRESRHKSN 

ITFNNNDWSFLEYRTFQFQPSKSHGSESDYIVMPNILVLGAAVMMENKPMTLKLIOT 

I^TTLGERAFMNRTVGEIMWGYKDPLVNLINKYFPC^PFKDKFGLFAE 

FTVFTGVQNISRIHLVDKWNGLSKVDFWHS 

PEACRSMKLMxXESGVPEGIPTYRFVAPKTLFANGSIYPPNEGFCPCLESGlQNVSTC 
RFSAPLFLSHPHFLNADPVLAEAVTGLHPNQEAHSLFLDIHPVTGI PMNC SVKLQLSL 
YMKSVAGIGQTCKIEPVVLPLLWFAESGAMEGETLHTFYTQLVLMPKV^ 
LGCVXLLVPVICQIRSQEKCYL^SSSKKGSKDKEAIQAYSESLMT^ 

     3'UTR           1600..2566
     polyA_site      2532..2537
BASE COUNT      528 a    811 c    695 g    532 t
ORIGIN
1 cgtcgccgtc cccgtctcct gccaggcgcg gagccctgcg agccgcgggt gggccccagg
61 cgcgcagaca tgggctgctc cgccaaagcg cgctgggctg ccggggcgct gggcgtcgcg
121 gggctactgt gcgctgtgct gggcgctgtc atgatcgtga tggtgccgtc gctcatcaag
181 cagcaggtcc ttaagaacgt gcgcatcgac cccagtagcc tgtccttcaa catgtggaag
241 gagatcccta tccccttcta tctctccgtc tacttctttg acgtcatgaa ccccagcgag
301 atcctgaagg gcgagaagcc gcaggtgcgg gagcgcgggc cctacgtgta cagggagtcc
361 aggcacaaaa gcaacatcac cttcaacaac aacgacaccg tgtccttcct cgagtaccgc

FIG. 38A
421 accttccagt tccagccctc caagtcccac ggctcggaga gcgactacat cgtcatgccc
481 aacatcctgg tcttgggtgc ggcggtgatg atggagaata agcccatgac cctgaagctc
541 atcatgacct tggcattcac caccctcggc gaacgtgcct tcatgaaccg cactgtgggt
601 gagatcatgt ggggctacaa ggaccccctt gtgaatctca tcaacaagta ctttccaggc
661 atgttcccct tcaaggacaa gttcggatta tttgctgagc tcaacaactc cgactctggg
721 ctcttcacgg tgttcacggg ggtccagaac atcagcagga tccacctcgt ggacaagtgg
781 aacgggctga gcaaggttga cttctggcat tccgatcagt gcaacatgat caatggaact
841 tctgggcaaa tgtggccgcc cttcatgact cctgagtcct cgctggagtt ctacagcccg
901 gaggcctgcc gatccatgaa gctaatgtac aaggagtcag gggtgtttga aggcatcccc
961 acctatcgct tcgtggctcc caaaaccctg tttgccaacg ggtccatcta cccacccaac
1021 gaaggcttct gcccgtgcct ggagtctgga attcagaacg tcagcacctg caggttcagt
1081 gcccccttgt ttctctccca tcctcacttc ctcaacgccg acccggttct ggcagaagcg
1141 gtgactggcc tgcaccctaa ccaggaggca cactccttgt tcctggacat ccacccggtc
1201 acgggaatcc ccatgaactg ctctgtgaaa ctgcagctga gcctctacat gaaatctgtc
1261 gcaggcattg gacaaactgg gaagattgag cctgtggtcc tgccgctgct ctggtttgca
1321 gagagcgggg ccatggaggg ggagactctt cacacattct acactcagct ggtgttgatg
1381 cccaaggtga tgcactatgc ccagtacgtc ctcctggcgc tgggctgcgt cctgctgctg
1441 gtccctgtca tctgccaaat ccggagccaa gagaaatgct atttattttg gagtagtagt
1501 aaaaagggct caaaggataa ggaggccatt caggcctatt ctgaatccct gatgacatca
1561 gctcccaagg gctctgtgct gcaggaagca aaactgtagg gtcctgagga caccgtgagc
1621 cagccaggcc tggccgctgg gcctgaccgg ccccccagcc cctacacccc gcttctcccg
1681 gactctccca gcagacagcc ccccagcccc acagcctgag cctcccagct gccatgtgcc
1741 tgttgcacac ctgcacacac gccctggcac acatacacac atgcgtgcag gcttgtgcag
1801 acactcaggg atggagctgc tgctgaaggg acttgtaggg agaggctcgt caacaagcac
1861 tgttctggaa ccttctctcc acgtggccca caggctgacc acaggggctg tgggtcctgc
1921 gtccccttcc tcgggtgagc ctggcctgtc ccgttcagcc gttgggccag gcttcctccc
1981 ctccaaggtg aaacactgca gtcccggtgt ggtggctccc catgcaggac gggccaggct
2041 gggagtgccg ccttcctgtg ccaaattcag tggggactca gtgcccaggc cctggcacga
2101 gctttggcct tggtctacct gccaggccag gcaaagcgcc tttacacagg cctcggaaaa
2161 caatggagtg agcacaagat gccctgtgca gctgcccgag ggtctccgcc caccccggcc
2221 ggactttgat ccccccgaag tcttcacagg cactgcatcg ggttgtctgg cgcccttttc
2281 ctccagccta aactgacatc atcctatgga ctgagccggc cactctctgg ccgaagtggc
2341 gcaggctgtg cccccgagct gcccccaccc cctcacaggg tccctcagat tataggtgcc
2401 caggctgagg tgaagaggcc tgggggccct gccttccggg cgctcctgga ccctggggca
2461 aacctgtgac ccttttctac tggaatagaa atgagtttta tcatctttga aaaataattc
2521 actcttgaag taataaacgt ttaaaaaaat ggaaaaaaaa aaaaa
1. Claims: 1-12 (partially)

INVENTION 1: A nucleic acid molecule of at least 5
nucleotides in length consisting of a part of the AT3 gene
and comprising a polymorphic (biallelic) site according to
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



2. Claims: 1-12 (partially) 

INVENTION 2: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the CETP gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



3. Claims: 1-12 (partially) 

INVENTION 3: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the CLanalog 
gene and comprising a polymorphic (biallelic) site according 
to the nucleotide positions as indicated in the attached 
table - column 4, an allele-specific oligonucleotide
hybridizing to such a polymorphic site, an isolated gene 
product encoded by such a nucleic acid molecule, and a 
method of analyzing such a nucleic acid by determining the 
bases occupying the polymorphic site(s). 



4. Claims: 1-12 (partially) 

INVENTION 4: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the F2R gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



5. Claims: 1-12 (partially) 

INVENTION 5: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the F2 gene
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



6. Claims: 1-12 (partially) 

INVENTION 6: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the F3 gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



7. Claims: 1-12 (partially) 

INVENTION 7: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the F5 gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



8. Claims: 1-12 (partially) 

INVENTION 8: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the HCF2 gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



9. Claims: 1-12 (partially) 

INVENTION 9: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the HMGCR gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



10. Claims: 1-12 (partially) 

INVENTION 10: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the ITGA2B 
gene and comprising a polymorphic (biallelic) site according 
to the nucleotide positions as indicated in the attached 
table - column 4, an allele-specific oligonucleotide 
hybridizing to such a polymorphic site, an isolated gene 
product encoded by such a nucleic acid molecule, and a 
method of analyzing such a nucleic acid by determining the 
bases occupying the polymorphic site(s). 



11. Claims: 1-12 (partially) 

INVENTION 11: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the ITB3 gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



12. Claims: 1-12 (partially) 

INVENTION 12: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the LCAT gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



13. Claims: 1-12 (partially) 

INVENTION 13: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the LDLR gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the
polymorphic site(s). 



14. Claims: 1-12 (partially) 

INVENTION 14: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the LPL gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



15. Claims: 1-12 (partially) 

INVENTION 15: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the PROC gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



16. Claims: 1-12 (partially) 

INVENTION 16: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the PTAFR gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 



17. Claims: 1-12 (partially) 

INVENTION 17: A nucleic acid molecule of at least 5 
nucleotides in length consisting of a part of the TFPI gene 
and comprising a polymorphic (biallelic) site according to 
the nucleotide positions as indicated in the attached table 
- column 4, an allele-specific oligonucleotide hybridizing 
to such a polymorphic site, an isolated gene product encoded 
by such a nucleic acid molecule, and a method of analyzing 
such a nucleic acid by determining the bases occupying the 
polymorphic site(s). 
18. Claims: 1-12 (partially)

INVENTION 18: A nucleic acid molecule of at least 5
nucleotides in length consisting of a part of the TBXA2R
gene and comprising a polymorphic (biallelic) site according
to the nucleotide positions as indicated in the attached
table - column 4, an allele-specific oligonucleotide
hybridizing to such a polymorphic site, an isolated gene 
product encoded by such a nucleic acid molecule, and a 
method of analyzing such a nucleic acid by determining the 
bases occupying the polymorphic site(s). 
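
The allele-specific oligonucleotides named in the groups above can be illustrated with a short sketch. It is an example under stated assumptions, not the applicants' method: it takes the first 60 bases of the HSU59436 (LDLR exon 12) listing and the `replace(45,"t")` variation from that GenBank record, rather than a position from the attached table, and builds one probe per allele centred on the polymorphic site. The function names `allele_probes` and `base_at_site` and the flank length of 7 are hypothetical choices; the only constraint taken from the text is the minimum length of 5 nucleotides.

```python
def allele_probes(seq: str, site: int, alleles: str, flank: int = 7) -> dict:
    """Build one probe per allele, centred on the 1-based polymorphic site."""
    if site - flank < 1 or site + flank > len(seq):
        raise ValueError("flank extends beyond the sequence")
    left = seq[site - 1 - flank: site - 1]   # bases upstream of the site
    right = seq[site: site + flank]          # bases downstream of the site
    probes = {allele: left + allele + right for allele in alleles}
    # The text requires molecules of at least 5 nucleotides.
    if any(len(p) < 5 for p in probes.values()):
        raise ValueError("probe shorter than 5 nucleotides")
    return probes


def base_at_site(sample: str, site: int) -> str:
    """Return the base occupying the 1-based polymorphic site in a sample."""
    return sample[site - 1]


# First 60 bases of the HSU59436 ORIGIN sequence (illustrative input).
SEGMENT = "tctccttatccacttgtgtgtctagatctcctcagtggccgcctctactgggttgactcc"

probes = allele_probes(SEGMENT, 45, "ct")
for allele, probe in probes.items():
    print(allele, probe)
print(base_at_site(SEGMENT, 45))
```

Each probe here is 15 nucleotides long and the two differ only at the central base. A real assay design would also weigh melting temperature and strand choice, which this sketch leaves out.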